Abstract

Constrained submodular maximization problems have long been studied, most recently in the context of auctions
and computational advertising, with near-optimal results known under a variety of constraints when the
submodular function is monotone. The case of non-monotone submodular maximization is less well understood:
the first approximation algorithms even for the unconstrained setting were given by Feige et al. (FOCS '07). More
recently, Lee et al. (STOC '09, APPROX '09) show how to approximately maximize non-monotone submodular
functions when the constraints are given by the intersection of p matroid constraints; their algorithm is based on
local-search procedures that consider p-swaps, and hence the running time may be $n^{\Omega(p)}$, implying their algorithm
is polynomial-time only for constantly many matroids.

In this paper, we give algorithms that work for p-independence systems (which generalize constraints given 
by the intersection of p matroids), where the running time is poly(n, p). Both our algorithms and analyses are
simple: our algorithm essentially reduces the non-monotone maximization problem to multiple runs of the greedy 
algorithm previously used in the monotone case. Our idea of using existing algorithms for monotone functions 
to solve the non-monotone case also works for maximizing a submodular function with respect to a knapsack 
constraint: we get a simple greedy-based constant-factor approximation for this problem. 

With these simpler algorithms, we are able to adapt our approach to constrained non-monotone submodular 
maximization to the (online) secretary setting, where elements arrive one at a time in random order, and the 
algorithm must make irrevocable decisions about whether or not to select each element as it arrives. We give 
constant approximations in this secretary setting when the algorithm is constrained subject to a uniform matroid 
or a partition matroid, and give an $O(\log k)$ approximation when it is constrained by a general matroid of rank k.
1 Introduction



We present algorithms for maximizing (not necessarily monotone) non-negative submodular functions satisfying 
$f(\emptyset) = 0$ under a variety of constraints considered earlier in the literature. Lee et al. [LMNS10, LSV09] gave the
first algorithms for these problems via local-search algorithms: in this paper, we consider greedy approaches that 
have been successful for monotone submodular maximization, and show how these algorithms can be adapted very 
simply to non-monotone maximization as well. Using this idea, we show the following results: 

• We give an $O(p)$-approximation for maximizing submodular functions subject to a p-independence system.
This extends the result of Lee et al. [LMNS10, LSV09] which applied to constraints given by the intersection
of p matroids, where p was a constant. (Intersections of p matroids give p-independence systems, but the converse is
not true.) Our greedy-based algorithm has a run-time polynomial in p, and hence gives the first polynomial-time
algorithms for non-constant values of p.

• We give a constant-factor approximation for maximizing submodular functions subject to a knapsack constraint.
This greedy-based algorithm gives an alternate approach to solve this problem; Lee et al. [LMNS10]
gave LP-rounding-based algorithms that achieved a $(5 + \epsilon)$-approximation for constraints given by
the intersection of p knapsack constraints, where p is a constant.

Armed with simpler greedy algorithms for nonmonotone submodular maximization, we are able to perform constrained
nonmonotone submodular maximization in several special cases in the secretary setting as well: here items
arrive online in random order, and the algorithm must make irrevocable decisions as they arrive.

• We give an $O(1)$-approximation for maximizing submodular functions subject to a cardinality constraint and
subject to a partition matroid. (Using a reduction of [BDG+09], the latter implies $O(1)$-approximations to
e.g., graphical matroids.) Our secretary algorithms are simple and efficient.

• We give an $O(\log k)$-approximation for maximizing submodular functions subject to an arbitrary rank k matroid
constraint. This matches the known bound for the matroid secretary problem, in which the function to
be maximized is simply linear.

No prior results were known for submodular maximization in the secretary setting, even for monotone submodular 
maximization; there is some independent work, see §1.3.1 for details.

Compared to previous offline results, we trade off small constant factors in the approximation ratios of our
algorithms for exponential improvements in run time: maximizing nonmonotone submodular functions subject to
(constant) $p \ge 2$ matroid constraints currently has a $(\frac{p^2}{p-1} + \epsilon)$-approximation due to a paper of Lee, Sviridenko
and Vondrák [LSV09], using an algorithm with run-time exponential in p. For p = 1 the best result is a 3.23-approximation
by Vondrák [Von09]. In contrast, our algorithms have run time only linear in p, but our approximation
factors are worse by constant factors for the small values of p where previous results exist. We have not tried to
optimize our constants, but it seems likely that matching or improving on the previous results for constant p will
need more than just choosing the parameters carefully. We leave such improvements as an open problem.



1.1 Submodular Maximization and Secretary Problems in an Economic Context 

Submodular maximization and secretary problems have both been widely studied in their economic contexts. The
problem of selecting a subset of people in a social network to maximize their influence in a viral marketing campaign
can be modeled as a constrained submodular maximization problem [KKT03, MR07]. When costs are introduced,
the influence minus the cost gives us non-monotone submodular maximization problems; prior to this work, online
algorithms for non-monotone submodular maximization problems were not known. Asadpour et al. studied the
problem of adaptive stochastic (monotone) submodular maximization with applications to budgeting and sensor
placement [ANS08], and Agrawal et al. showed that the correlation gap of submodular functions was bounded by
a constant using an elegant cost-sharing argument, and related this result to social welfare maximizing auctions
[ADSY09]. Finally, secretary problems, in which elements arriving in random order must be selected so as to
maximize some constrained objective function, have well-known connections to online auctions [Kle05, BIK07,
BIKK07, HKP04]. Our simpler offline algorithms allow us to generalize these results to give the first secretary
algorithms capable of handling a non-monotone submodular objective function.



1.2 Our Main Ideas 



At a high level, the simple yet crucial observation for the offline results is this: many of the previous algorithms and
proofs for constrained monotone submodular maximization can be adapted to show that the set S produced by them
satisfies $f(S) \ge \beta f(S \cup C^*)$, for some $0 < \beta \le 1$, and $C^*$ being an optimal solution. In the monotone case, the
right hand side is at least $\beta f(C^*) = \beta\,\mathrm{OPT}$ and we are done. In the non-monotone case, we cannot do this. However,
we observe that if $f(S \cap C^*)$ is a reasonable fraction of OPT, then (approximately) finding the most valuable set
within S would give us a large value; and since we work with constraints that are downwards closed, finding such a
set is just unconstrained maximization on $f(\cdot)$ restricted to S, for which Feige et al. [FMV07] give good algorithms!
On the other hand, if $f(S \cap C^*) \le \epsilon\,\mathrm{OPT}$ and $f(S)$ is also too small, then one can show that deleting the elements
in S and running the procedure again to find another set $S' \subseteq \Omega \setminus S$ with $f(S') \ge \beta f(S' \cup (C^* \setminus S))$ would
guarantee a good solution! Details for the specific problems appear in the following sections; we first consider the
simplest cardinality constraint case in Section 2 to illustrate the general idea, and then give more general results in
Sections 3.1 and 3.2.

For the secretary case where the elements arrive in random order, algorithms were not known for the monotone
case either. The main complication is that we cannot run a greedy algorithm (since the elements are arriving
randomly), and moreover the value of an incoming element depends on the previously chosen set of elements.
Furthermore, to extend the results to the non-monotone case, one needs to avoid the local-search algorithms (which,
in fact, motivated the above results), since these algorithms necessarily implement multiple passes over the input,
while the secretary model only allows a single pass over it. The details on all these are given in Section 4.



1.3 Related Work 

Monotone Submodular Maximization. The (offline) monotone submodular optimization problem has long been
studied: Fisher, Nemhauser, and Wolsey [NWF78, FNW78] showed that the greedy and local-search algorithms
give an $e/(e-1)$-approximation with cardinality constraints, and a $(p + 1)$-approximation under p matroid constraints.
In another line of work, [Jen76, KH78, HKJ80] showed that the greedy algorithm is a p-approximation for
maximizing a modular (i.e., additive) function subject to a p-independence system. This proof extends to show a
$(p + 1)$-approximation for monotone submodular functions under the same constraints (see, e.g., [CCPV09]). A
long-standing open problem was to improve on these results; nothing better than a 2-approximation was known even for
monotone maximization subject to a single partition matroid constraint. Calinescu et al. [CCPV07] showed how to
maximize monotone submodular functions representable as weighted matroid rank functions subject to any matroid
with an approximation ratio of $e/(e-1)$, and soon thereafter, Vondrák extended this result to all submodular
functions [Von08]; these highly influential results appear jointly in [CCPV09]. Subsequently, Lee et al. [LSV09] give
algorithms that beat the $(p + 1)$-bound for p matroid constraints with $p \ge 2$ to get a $(p + \epsilon)$-approximation.

Knapsack constraints. Sviridenko [Svi04] extended results of Wolsey [Wol82] and Khuller et al. [KMN99] to
show that a greedy-like algorithm with partial enumeration gives an $e/(e-1)$-approximation to monotone submodular
maximization subject to a knapsack constraint. Kulik et al. [KST09] showed that one could get essentially
the same approximation subject to a constant number of knapsack constraints. Lee et al. [LMNS10] give a
5-approximation for the same problem in the non-monotone case.



Mixed Matroid-Knapsack Constraints. Chekuri et al. [CVZ09] give strong concentration results for dependent
randomized rounding with many applications; one of these applications is a $(e/(e-1) + \epsilon)$-approximation for
monotone maximization with respect to a matroid and any constant number of knapsack constraints. [GNR09,
Section F.1] extends ideas from [CK05] to give polynomial-time algorithms for non-monotone submodular
maximization with respect to a p-system and q knapsacks: these algorithms achieve a $(p + q + O(1))$-approximation
for constant q (since the running time is $n^{\mathrm{poly}(q)}$), or a $(p + 2)(q + 1)$-approximation for arbitrary q; at a high level,
their idea is to "emulate" a knapsack constraint by a polynomial number of partition matroid constraints.
Non-Monotone Submodular Maximization. In the non-monotone case, even the unconstrained problem is NP-hard
(it captures max-cut). Feige, Mirrokni and Vondrák [FMV07] first gave constant-factor approximations for this
problem. Lee et al. [LMNS10] gave the first approximation algorithms for constrained non-monotone maximization
(subject to p matroid constraints, or p knapsack constraints); the approximation factors were improved by Lee et
al. [LSV09]. The algorithms in the previous two papers are based on local-search with p-swaps and would take
$n^{\Omega(p)}$ time. Recent work by Vondrák [Von09] gives much further insight into the approximability of submodular
maximization problems.

Secretary Problems. The original secretary problem seeks to maximize the probability of picking the element in
a collection having the highest value, given that the elements are examined in random order [Dyn63, Fre83, Fer89].
The problem was used to model item-pricing problems by Hajiaghayi et al. [HKP04]. Kleinberg [Kle05] showed that
the problem of maximizing a modular function subject to a cardinality constraint in the secretary setting admits a
$(1 + O(1/\sqrt{k}))$-approximation, where k is the cardinality. (We show that maximizing a submodular function subject to a
cardinality constraint cannot be approximated to better than some universal constant, independent of the value of k.)
Babaioff et al. [BIK07] wanted to maximize modular functions subject to matroid constraints, again in a secretary
setting, and gave constant-factor approximations for some special matroids, and an $O(\log k)$-approximation for
general matroids having rank k. This line of research has seen several developments recently [BIKK07, DP08,
KP09, BDG+09].



1.3.1 Independent Work on Submodular Secretaries 

Concurrently and independently of our work, Bobby Kleinberg has given an algorithm similar to that in Section 4.1 for
monotone secretary submodular maximization under a cardinality constraint [Kle09]. Again independently, Bateni
et al. consider the problem of non-monotone submodular maximization in the secretary setting [BHZ10]; they give a
different $O(1)$-approximation subject to a cardinality constraint, an $O(L \log^2 k)$-approximation subject to L matroid
constraints, and an $O(L)$-approximation subject to L knapsack constraints in the secretary setting. While we do
not consider multiple constraints, it is easy to extend our results to obtain $O(L \log k)$ and $O(L)$ respectively using
standard techniques.



1.4 Preliminaries 

Given a set S and an element e, we use $S + e$ to denote $S \cup \{e\}$. A function $f : 2^\Omega \to \mathbb{R}_+$ is submodular if for
all $S, T \subseteq \Omega$, $f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$. Equivalently, f is submodular if it has decreasing marginal
utility: i.e., for all $S \subseteq T \subseteq \Omega$, and for all $e \in \Omega$, $f(S + e) - f(S) \ge f(T + e) - f(T)$. Also, f is called monotone
if $f(S) \le f(T)$ for $S \subseteq T$. Given f and $S \subseteq \Omega$, define $f_S : 2^\Omega \to \mathbb{R}$ as $f_S(A) := f(S \cup A) - f(S)$. The following
facts are standard.

Proposition 1.1. If f is submodular with $f(\emptyset) = 0$, then

• for any S, $f_S$ is submodular with $f_S(\emptyset) = 0$, and

• f is also subadditive; i.e., for disjoint sets A, B, we have $f(A) + f(B) \ge f(A \cup B)$.
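The definitions above can be checked by brute force on a tiny instance. The sketch below uses the cut function of a 4-cycle as a hypothetical example of a non-negative, non-monotone submodular function with $f(\emptyset) = 0$; the graph and all function names are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical example: the cut function of the 4-cycle 0-1-2-3-0.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
GROUND = frozenset(range(4))

def cut(S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in EDGES)

def is_submodular(f, ground):
    """Check f(S) + f(T) >= f(S | T) + f(S & T) for all S, T."""
    subsets = powerset(ground)
    return all(f(S) + f(T) >= f(S | T) + f(S & T)
               for S in subsets for T in subsets)

def is_monotone(f, ground):
    """Check f(S) <= f(T) whenever S is a subset of T."""
    subsets = powerset(ground)
    return all(f(S) <= f(T) for S in subsets for T in subsets if S <= T)

def marginal(f, S, A):
    """The marginal function f_S(A) = f(S u A) - f(S)."""
    return f(S | A) - f(S)
```

On this instance the cut function passes the submodularity check but fails the monotonicity check, which is the setting the paper targets.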

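Matroids. A matroid is a pair $\mathcal{M} = (\Omega, \mathcal{I} \subseteq 2^\Omega)$, where $\mathcal{I}$ contains $\emptyset$, if $A \in \mathcal{I}$ and $B \subseteq A$ then $B \in \mathcal{I}$, and for
every $A, B \in \mathcal{I}$ with $|A| < |B|$, there exists $e \in B \setminus A$ such that $A + e \in \mathcal{I}$. The sets in $\mathcal{I}$ are called independent,
and the rank of a matroid is the size of any maximal independent set (base) in $\mathcal{M}$. In a uniform matroid, $\mathcal{I}$ contains
all subsets of size at most k. In a partition matroid, we have groups $g_1, g_2, \ldots, g_k \subseteq \Omega$ with $g_i \cap g_j = \emptyset$ and $\cup_j g_j = \Omega$;
the independent sets are $S \subseteq \Omega$ such that $|S \cap g_i| \le 1$.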
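As a small illustration of these definitions, the sketch below writes the uniform and partition matroids as independence oracles and verifies the three matroid axioms by exhaustive enumeration. The oracle interface and the function names are assumptions made for this example only, and the check is exponential in the ground-set size.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def uniform_independent(S, k):
    """Uniform matroid: any set of size at most k is independent."""
    return len(S) <= k

def partition_independent(S, groups):
    """Partition matroid: at most one element from each group."""
    return all(len(S & g) <= 1 for g in groups)

def is_matroid(ground, independent):
    """Brute-force check of the matroid axioms for an independence oracle."""
    family = {S for S in powerset(ground) if independent(S)}
    if frozenset() not in family:
        return False
    for A in family:
        # Downward closure: dropping one element at a time suffices.
        if any(A - {e} not in family for e in A):
            return False
    for A in family:
        for B in family:
            # Exchange: a smaller set can be extended from a larger one.
            if len(A) < len(B) and not any(A | {e} in family for e in B - A):
                return False
    return True
```

For instance, `is_matroid(range(4), lambda S: partition_independent(S, [frozenset({0, 1}), frozenset({2, 3})]))` exercises the partition-matroid case.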

Unconstrained (Non-Monotone) Submodular Maximization. We use $\mathrm{FMV}_\alpha(S)$ to denote an approximation
algorithm given by Feige, Mirrokni, and Vondrák [FMV07] for unconstrained submodular maximization in the
non-monotone setting: it returns a set $T \subseteq S$ such that $f(T) \ge \frac{1}{\alpha} \max_{T' \subseteq S} f(T')$. In fact, Feige et al. present many
such algorithms: the best approximation ratio among these is $\alpha = 2.5$ via a local-search algorithm, and the easiest is a
4-approximation that just returns a uniformly random subset of S.
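The random-subset variant is simple enough to sketch directly. The code below is a minimal illustration, assuming a hypothetical cut function on a 4-cycle as the objective; `random_subset` keeps each element independently with probability 1/2, and `expected_value_of_random_subset` computes the exact expectation by enumeration so that the factor of 4 can be compared against the true optimum on this toy instance.

```python
import random
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical objective: cut function of the 4-cycle 0-1-2-3-0.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def random_subset(S, rng):
    """Keep each element of S independently with probability 1/2."""
    return frozenset(e for e in sorted(S) if rng.random() < 0.5)

def expected_value_of_random_subset(f, S):
    """Exact E[f(R)] for a uniformly random subset R of S (exponential time)."""
    subsets = powerset(S)
    return sum(f(T) for T in subsets) / len(subsets)
```

On the 4-cycle every edge is cut with probability 1/2, so the expectation is 2 while the maximum cut is 4; the 4-approximation guarantee holds with room to spare here.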
2 Submodular Maximization subject to a Cardinality Constraint



We first give an offline algorithm for submodular maximization subject to a cardinality constraint: this illustrates
our simple approach, upon which we build in the following sections. Formally, we are given a subset $X \subseteq \Omega$ and a
non-negative submodular function f that is potentially non-monotone, but has $f(\emptyset) = 0$. We want to approximate
$\max_{S \subseteq X : |S| \le k} f(S)$. The greedy algorithm starts with $S \leftarrow \emptyset$, and repeatedly picks an element e with maximum
marginal value $f_S(e)$ until it has k elements.

Lemma 2.1. For any set C with $|C| \le k$, the greedy algorithm returns a set S that satisfies $f(S) \ge \frac{1}{2} f(S \cup C)$.

Proof. Suppose not. Then $f_S(C) = f(S \cup C) - f(S) > f(S)$, and hence (by subadditivity of $f_S$) there is at least one element $e \in C \setminus S$
that has $f_S(\{e\}) > \frac{f(S)}{|C \setminus S|} \ge \frac{f(S)}{k}$. Since we ran the greedy algorithm, at each step this element e would have been
a contender to be added, and by submodularity, e's marginal value would have been only higher then. Hence the
elements actually added in each of the k steps would have had marginal value at least e's marginal value at that
time, which is more than $f(S)/k$. This implies that $f(S) > k \cdot f(S)/k$, a contradiction. □
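The greedy procedure and the inequality of Lemma 2.1 can be exercised on a small instance. The sketch below is illustrative only: the 5-vertex graph, the cut objective, and the tie-breaking rule (smallest label first) are assumptions made for the example.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
GROUND = frozenset(range(5))

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def greedy_cardinality(f, X, k):
    """Add the element of largest marginal value k times, even if the
    marginal value is negative (as in the text). Ties go to the smallest label."""
    S = frozenset()
    for _ in range(min(k, len(X))):
        e = max(sorted(X - S), key=lambda x: f(S | {x}) - f(S))
        S = S | {e}
    return S

def lemma_2_1_holds(f, X, k):
    """Brute-force check of f(S) >= f(S u C) / 2 for every C with |C| <= k."""
    S = greedy_cardinality(f, X, k)
    return all(2 * f(S) >= f(S | C) for C in powerset(X) if len(C) <= k)
```

Calling `lemma_2_1_holds(cut, GROUND, 2)` enumerates all candidate sets C of size at most 2 and compares them against the greedy output.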

This lemma is existentially tight: observe that if the function f is just the cardinality function $f(S) = |S|$, and
if S and C happen to be disjoint with $|S| = |C| = k$, then $f(S) = \frac{1}{2} f(S \cup C)$.

Lemma 2.2 (Special Case of Claim 2.7 in [LMNS10]). Given sets $C, S_1 \subseteq U$ (for a ground set U), let $C' = C \setminus S_1$, and $S_2 \subseteq U \setminus S_1$.
Then $f(S_1 \cup C) + f(S_1 \cap C) + f(S_2 \cup C') \ge f(C)$.

Proof. By submodularity, it follows that $f(S_1 \cup C) + f(S_2 \cup C') \ge f(S_1 \cup S_2 \cup C) + f(C')$. Again using
submodularity, we get $f(C') + f(S_1 \cap C) \ge f(C) + f(\emptyset)$. Putting these together and using non-negativity of $f(\cdot)$,
the lemma follows. □

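We now give our algorithm Submod-Max-Cardinality (Figure 1) for submodular maximization: it has the same
multi-pass structure as that of Lee et al., but uses the greedy analysis above instead of a local-search algorithm.

    1: let $X_1 \leftarrow X$
    2: for i = 1 to 2 do
    3:     let $S_i \leftarrow \mathrm{Greedy}(X_i)$
    4:     let $S'_i \leftarrow \mathrm{FMV}_\alpha(S_i)$
    5:     let $X_{i+1} \leftarrow X_i \setminus S_i$
    6: end for
    7: return best of $S_1, S'_1, S_2$

    Figure 1: Submod-Max-Cardinality(X, k, f)

Theorem 2.3. The algorithm Submod-Max-Cardinality is a $(4 + \alpha)$-approximation.

Proof. Let $C^*$ be the optimal solution with $f(C^*) = \mathrm{OPT}$. We know that $f(S_1) \ge \frac{1}{2} f(S_1 \cup C^*)$. Also, if
$f(S_1 \cap C^*)$ is at least $\epsilon\,\mathrm{OPT}$, then we know that the $\alpha$-approximate algorithm $\mathrm{FMV}_\alpha$ gives us a value of at
least $(\epsilon/\alpha)\,\mathrm{OPT}$. Else,

$$f(S_1) \ge \tfrac{1}{2} f(S_1 \cup C^*) \ge \tfrac{1}{2}\big(f(S_1 \cup C^*) + f(S_1 \cap C^*)\big) - \epsilon\,\mathrm{OPT}/2. \tag{1}$$

Similarly, we get that $f(S_2) \ge \frac{1}{2} f(S_2 \cup (C^* \setminus S_1))$. Adding this to (1), we get

$$2\max(f(S_1), f(S_2)) \ge f(S_1) + f(S_2)$$
$$\ge \tfrac{1}{2}\big(f(S_1 \cup C^*) + f(S_1 \cap C^*) + f(S_2 \cup (C^* \setminus S_1))\big) - \epsilon\,\mathrm{OPT}/2 \tag{2}$$
$$\ge \tfrac{1}{2} f(C^*) - \epsilon\,\mathrm{OPT}/2 \tag{3}$$
$$\ge \tfrac{1}{2}(1 - \epsilon)\,\mathrm{OPT},$$

where we used Lemma 2.2 to get from (2) to (3). Hence $\max\{f(S_1), f(S_2)\} \ge \frac{1 - \epsilon}{4}\,\mathrm{OPT}$. The approximation
factor now is $\max\{\alpha/\epsilon, 4/(1 - \epsilon)\}$. Setting $\epsilon = \frac{\alpha}{\alpha + 4}$, we get a $(4 + \alpha)$-approximation, as claimed. □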
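A compact sketch of the two-pass structure follows. It is not the authors' implementation: the objective is a hypothetical cut function, and the unconstrained step uses an exhaustive search (`best_subset`) as a stand-in for $\mathrm{FMV}_\alpha$, which corresponds to $\alpha = 1$ at exponential cost. Any routine with the same signature, such as a random-subset sampler, could be passed instead.

```python
from itertools import combinations

def subsets(X, max_size=None):
    """Subsets of X (optionally of bounded size) as frozensets."""
    items = sorted(X)
    top = len(items) if max_size is None else min(max_size, len(items))
    return [frozenset(c) for r in range(top + 1)
            for c in combinations(items, r)]

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
GROUND = frozenset(range(5))

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def greedy(f, X, k):
    """Greedy for a cardinality constraint: k picks of maximum marginal value."""
    S = frozenset()
    for _ in range(min(k, len(X))):
        e = max(sorted(X - S), key=lambda x: f(S | {x}) - f(S))
        S = S | {e}
    return S

def best_subset(f, S):
    """Stand-in for FMV_alpha with alpha = 1: exhaustive search inside S."""
    return max(subsets(S), key=f)

def submod_max_cardinality(f, X, k, unconstrained=best_subset):
    """Two greedy passes plus one unconstrained step, following Figure 1."""
    X1 = frozenset(X)
    S1 = greedy(f, X1, k)           # step 3, first pass
    S1_prime = unconstrained(f, S1)  # step 4, first pass
    X2 = X1 - S1                     # step 5
    S2 = greedy(f, X2, k)           # step 3, second pass
    return max([S1, S1_prime, S2], key=f)  # step 7
```

With the exhaustive stand-in, Theorem 2.3 promises a value within a factor 5 of the optimum; on small graphs the returned set is usually much closer.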

Using the known value of $\alpha = 2.5$ from Feige et al. [FMV07], we get a 6.5-approximation for submodular
maximization under cardinality constraints. While this is weaker than the 3.23-approximation of Vondrák [Von09],
or even the 4-approximation we could get from Lee et al. [LMNS10] for this special case, the algorithm is faster,
and the idea behind the improvement works in several other contexts, as we show in the following sections.
3 Fast Algorithms for p-Systems and Knapsacks



In this section, we show our greedy-style algorithms which achieve an $O(p)$-approximation for submodular
maximization over p-systems, and a constant-factor approximation for submodular maximization over a knapsack. Due
to space constraints, many proofs are deferred to the appendices.



3.1 Submodular Maximization for Independence Systems 

Let $\Omega$ be a universe of elements and consider a collection $\mathcal{I} \subseteq 2^\Omega$ of subsets of $\Omega$. $(\Omega, \mathcal{I})$ is called an independence
system if (a) $\emptyset \in \mathcal{I}$, and (b) if $X \in \mathcal{I}$ and $Y \subseteq X$, then $Y \in \mathcal{I}$ as well. The subsets in $\mathcal{I}$ are called independent; for
any set S of elements, an inclusion-wise maximal independent set T of S is called a basis of S. For brevity, we say
that T is a basis, if it is a basis of $\Omega$.

Definition 3.1. Given an independence system $(\Omega, \mathcal{I})$ and a subset $S \subseteq \Omega$, the rank $r(S)$ is defined as the cardinality
of the largest basis of S, and the lower rank $\rho(S)$ is the cardinality of the smallest basis of S. The independence
system is called a p-independence system (or a p-system) if $\max_{S \subseteq \Omega} \frac{r(S)}{\rho(S)} \le p$.
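To make Definition 3.1 concrete, the sketch below computes the smallest p for which a small independence system is a p-system, by enumerating the bases of every subset. The example system (matchings in a 3-edge path) and all names are assumptions chosen for illustration; the enumeration is exponential and only suitable for toy instances.

```python
from fractions import Fraction
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def bases(S, independent):
    """Inclusion-wise maximal independent subsets of S."""
    indep = [T for T in powerset(S) if independent(T)]
    return [T for T in indep if not any(T < U for U in indep)]

def smallest_p(ground, independent):
    """max over S of r(S) / rho(S), skipping sets whose only basis is empty."""
    best = Fraction(1)
    for S in powerset(ground):
        sizes = [len(T) for T in bases(S, independent)]
        if min(sizes) > 0:
            best = max(best, Fraction(max(sizes), min(sizes)))
    return best

# Hypothetical example: elements are the edges of the path 0-1-2-3,
# and a set of edges is independent if it is a matching.
PATH_EDGES = [(0, 1), (1, 2), (2, 3)]

def is_matching(T):
    endpoints = [v for edge in T for v in edge]
    return len(endpoints) == len(set(endpoints))
```

For the path, the full edge set has bases `{(0, 1), (2, 3)}` and `{(1, 2)}`, so the ratio of rank to lower rank is 2 and the matchings form a 2-system.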



See, e.g., [CCPV09] for a discussion of independence systems and their relationship to other families of constraints;
it is useful to recall that intersections of p matroids form a p-independence system.



3.1.1 The Algorithm for p-Independence Systems 

Suppose we are given an independence system $(\Omega, \mathcal{I})$, a subset $X \subseteq \Omega$ and a non-negative submodular function f
that is potentially non-monotone, but has $f(\emptyset) = 0$. We want to find (or at least approximate) $\max_{S \subseteq X : S \in \mathcal{I}} f(S)$.
The greedy algorithm for this problem is what you would expect: start with the set $S = \emptyset$, and at each step pick
an element $e \in X \setminus S$ that maximizes $f_S(e)$ and ensures that $S + e$ is also independent. If no such element exists,
the algorithm terminates, else we set $S \leftarrow S + e$, and repeat. (Ideally, we would also check to see if $f_S(e) \le 0$,
and terminate at the first time this happens; we don't do that, and instead we add elements even when the marginal
gain is negative until we cannot add any more elements without violating independence.) The proof of the following
lemma appears in Appendix A, and closely follows that for the monotone case from [CCPV09].
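The greedy rule just described can be written against an independence oracle. The sketch below is an illustration under stated assumptions: the objective is a hypothetical cut function, the constraint is a hypothetical partition matroid (a 1-system), and ties are broken by smallest label. As in the text, elements are added even when the marginal gain is negative.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground` as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
GROUND = frozenset(range(5))

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

# Hypothetical constraint: a partition matroid with three groups.
GROUPS = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4})]

def independent(S):
    return all(len(S & g) <= 1 for g in GROUPS)

def greedy_independence(f, X, independent):
    """Repeatedly add the feasible element of largest marginal value,
    stopping only when no element can be added without losing independence."""
    S = frozenset()
    while True:
        candidates = [e for e in sorted(X - S) if independent(S | {e})]
        if not candidates:
            return S
        e = max(candidates, key=lambda x: f(S | {x}) - f(S))
        S = S | {e}
```

Since a partition matroid is a 1-system, the lemma that follows in the text would predict $f(S) \ge \frac{1}{2} f(C \cup S)$ for every independent C on this instance.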



Lemma 3.2. For a p-independence system, if S is the independent set returned by the greedy algorithm, then for
any independent set C, $f(S) \ge \frac{1}{p+1} f(C \cup S)$.

The algorithm Submod-Max-p-System (Figure 2) for maximizing a non-monotone submodular function f with
$f(\emptyset) = 0$ over a p-independence system now immediately suggests itself.

    1: $X_1 \leftarrow X$
    2: for i = 1 to p + 1 do
    3:     $S_i \leftarrow \mathrm{Greedy}(X_i, \mathcal{I}, f)$
    4:     $S'_i \leftarrow \mathrm{FMV}_\alpha(S_i)$
    5:     $X_{i+1} \leftarrow X_i \setminus S_i$
    6: end for
    7: return $S \leftarrow$ best among $\{S_i\}_{i=1}^{p+1} \cup \{S'_i\}_{i=1}^{p+1}$

    Figure 2: Submod-Max-p-System($X, \mathcal{I}, f$)

Theorem 3.3. The algorithm Submod-Max-p-System is a $(1 + \alpha)(p + 2 + 1/p)$-approximation for maximizing a
non-monotone submodular function over a p-independence system, where $\alpha$ is the approximation guarantee for
unconstrained (non-monotone) submodular maximization.

Proof. Let $C^*$ be an optimal solution with $\mathrm{OPT} = f(C^*)$, and let $C_i = C^* \cap X_i$ for all $i \in [p + 1]$; hence $C_1 = C^*$.
Note that $C_i$ is a feasible solution to the greedy optimization in Step 3. Hence, by Lemma 3.2, we know that
$f(S_i) \ge \frac{1}{p+1} f(C_i \cup S_i)$. Now, if for some i, it holds that $f(S_i \cap C_i) \ge \epsilon\,\mathrm{OPT}$ (for $\epsilon > 0$ to be chosen later), then
the guarantees of $\mathrm{FMV}_\alpha$ ensure that $f(S'_i) \ge (\epsilon\,\mathrm{OPT})/\alpha$, and we will get an $\alpha/\epsilon$-approximation. Else, it holds for
all $i \in [p + 1]$ that

$$f(S_i) \ge \tfrac{1}{p+1} f(C_i \cup S_i) + f(C_i \cap S_i) - \epsilon\,\mathrm{OPT}. \tag{4}$$

Now we can add all these inequalities, divide by p + 1, and use the argument from [LMNS10, Claim 2.7] to infer
that

$$f(S) \ge \frac{p}{(p+1)^2} f(C^*) - \epsilon\,\mathrm{OPT} = \mathrm{OPT}\left(\frac{p}{(p+1)^2} - \epsilon\right). \tag{5}$$

(While Claim 2.7 of [LMNS10] is used in the context of a local-search algorithm, it uses just the submodularity of
the function f, and the facts that $(\cup_{j<i} S_j \cup C) \cap (S_i \cup C_i) = C_i$ and $(\cup_{j<i} (S_j \cap C_j)) \cup C_i = C$ for every i.) Thus
the approximation factor is $\max\{\alpha/\epsilon, (\frac{p}{(p+1)^2} - \epsilon)^{-1}\}$. Setting $\epsilon = \frac{p}{(p+1)^2} \cdot \frac{\alpha}{1+\alpha}$, we get the claimed approximation
ratio. □
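For readers who want to see where the ratio in Theorem 3.3 comes from, here is a short worked check. It assumes that $\epsilon$ is chosen to balance the two terms $\alpha/\epsilon$ and $(\frac{p}{(p+1)^2} - \epsilon)^{-1}$ in the maximum; under that assumption both terms evaluate to the stated bound.

```latex
\[
\epsilon \;=\; \frac{p}{(p+1)^2}\cdot\frac{\alpha}{1+\alpha}
\quad\Longrightarrow\quad
\frac{\alpha}{\epsilon} \;=\; \frac{(1+\alpha)(p+1)^2}{p}
\;=\; (1+\alpha)\Bigl(p + 2 + \frac{1}{p}\Bigr),
\]
\[
\frac{p}{(p+1)^2} - \epsilon
\;=\; \frac{p}{(p+1)^2}\Bigl(1 - \frac{\alpha}{1+\alpha}\Bigr)
\;=\; \frac{p}{(1+\alpha)(p+1)^2}
\quad\Longrightarrow\quad
\Bigl(\frac{p}{(p+1)^2} - \epsilon\Bigr)^{-1}
\;=\; (1+\alpha)\Bigl(p + 2 + \frac{1}{p}\Bigr).
\]
```

For example, with $p = 1$ and $\alpha = 2.5$ both terms equal $3.5 \cdot 4 = 14$.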



Note that even using $\alpha = 1$, our approximation factors differ from the ratios in Lee et al. [LMNS10, LSV09]
by a small constant factor. However, the proof here is somewhat simpler and also works seamlessly for all
p-independence systems instead of just intersections of matroids. Moreover our running time is only linear in the
number of matroids, instead of being exponential as in the local-search: previously, no polynomial time algorithms
were known for this problem if p was super-constant. Note that running the algorithm just twice instead of p + 1
times reduces the run-time further; we can then use Lemma 2.2 instead of the full power of [LMNS10, Claim 2.7],
and hence the constants are slightly worse.

3.2 Submodular Maximization over Knapsacks 



The paper of Sviridenko [Svi04] gives a greedy algorithm with partial enumeration that achieves an $\frac{e}{e-1}$-approximation
for monotone submodular maximization with respect to a knapsack constraint. In particular, each element $e \in X$ has
a size $c_e$, and we are given a bound B: the goal is to maximize $f(S)$ over subsets $S \subseteq X$ such that $\sum_{e \in S} c_e \le B$.
His algorithm is the following: for each possible subset $S_0 \subseteq X$ of at most three elements, start with $S_0$ and
iteratively include the element which maximizes the gain in the function value per unit size, provided the resulting set still fits
in the knapsack. (If none of the remaining elements gives a positive gain, or fits in the knapsack, stop.) Finally, from
among these $O(|X|^3)$ solutions, choose the best one. Sviridenko shows that in the monotone submodular case, this
is an $\frac{e}{e-1}$-approximation algorithm. One can modify Sviridenko's algorithm and proof to show the following result
for non-monotone submodular functions. (The details are in Appendix B.)
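The description of the density greedy with partial enumeration can be sketched as follows. This follows the prose description above as literally as possible and is not the modified algorithm of Appendix B; the graph, the sizes in `SIZE`, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
GROUND = frozenset(range(5))
SIZE = {0: 1, 1: 1, 2: 2, 3: 1, 4: 1}  # hypothetical element sizes c_e

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def density_greedy(f, X, size, B, seed):
    """Starting from `seed`, keep adding the element with the best gain per
    unit size among those that still fit and have positive gain."""
    S = frozenset(seed)
    used = sum(size[e] for e in S)
    while True:
        candidates = [e for e in sorted(X - S)
                      if used + size[e] <= B and f(S | {e}) - f(S) > 0]
        if not candidates:
            return S
        e = max(candidates, key=lambda x: (f(S | {x}) - f(S)) / size[x])
        S = S | {e}
        used += size[e]

def partial_enumeration(f, X, size, B, max_seed=3):
    """One density-greedy run per feasible seed of at most `max_seed` elements."""
    solutions = []
    for r in range(max_seed + 1):
        for seed in combinations(sorted(X), r):
            if sum(size[e] for e in seed) <= B:
                solutions.append(density_greedy(f, X, size, B, seed))
    return solutions
```

A caller would take `max(partial_enumeration(cut, GROUND, SIZE, B), key=cut)` as the final answer; the number of seeds grows as $O(|X|^3)$, matching the count in the text.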



Theorem 3.4. There is a polynomial-time algorithm that given the above input, outputs a polynomial sized collection
of sets such that for any valid solution C, the collection contains a set S satisfying $f(S) \ge \frac{1}{2} f(S \cup C)$.

Note that the tight example for cardinality constraints shows that we cannot hope to do better than a factor of
1/2. Now using an argument very similar to that in Theorem 2.3 gives us the following result for non-monotone
submodular maximization with respect to a knapsack constraint.

Theorem 3.5. There is a $(4 + \alpha)$-approximation for the problem of maximizing a submodular function with respect
to a knapsack constraint, where $\alpha$ is the approximation guarantee for unconstrained (non-monotone) submodular
maximization.



4 Constrained Submodular Maximization in the Secretary Setting 

In this section, we will give algorithms for submodular maximization in the secretary setting: first subject to a 
cardinality constraint, then with respect to a partition matroid, and finally an algorithm for general matroids. The 
main algorithmic concerns tackled in this section when developing secretary algorithms are: (a) previous algorithms 
for non-monotone maximization required local-search, which seems difficult in an online secretary setting, so we 
developed greedy-style algorithms; (b) we need multiple passes for non-monotone optimization, and while that can 
be achieved using randomization and running algorithms in parallel, these parallel runs of the algorithms may have 
correlations that we need to control (or better still, avoid); and of course (c) the marginal value function changes 
over the course of the algorithm's execution as we pick more elements — in the case of partition matroids, e.g., this 
ever-changing function creates several complications. 
We also show an information theoretic lower bound: no secretary algorithm can approximately maximize a
submodular function subject to a cardinality constraint to a factor better than some universal constant greater
than 1, independent of k. (This is ignoring computational constraints, and so the computational inapproximability of
offline submodular maximization does not apply.) This is in contrast to the additive secretary problem, for which
Kleinberg gives a secretary algorithm achieving a $\frac{1}{1 - 5/\sqrt{k}}$-approximation [Kle05]. This lower bound is found in
Appendix D. (For a discussion about independent work on submodular secretary problems, see §1.3.1.)



4.1 Subject to a Cardinality Constraint 



The offline algorithm presented in Section 2 builds three potential solutions and chooses the best amongst them.
We now want to build just one solution in an online fashion, so that elements arrive in random order, and when an
element is added to the solution, it is never discarded subsequently. We first give an online algorithm that is given the
optimal value OPT as input but where the elements can come in worst-case order (we call this an "online algorithm
with advice"). Using sampling ideas we can estimate OPT, and hence use this advice-taking online algorithm in
the secretary model where elements arrive in random order.

To get the advice-taking online algorithm, we make two changes. First, we do not use the greedy algorithm
which selects elements of highest marginal utility, but instead use a threshold algorithm, which selects any element
that has marginal utility above a certain threshold. Second, we will change Step 4 of Algorithm Submod-Max-Cardinality
to use $\mathrm{FMV}_4$, which simply selects a random subset of the elements to get a 4-approximation to the
unconstrained submodular maximization problem [FMV07]. The Threshold Algorithm with inputs $(\tau, k)$ simply
selects each element as it appears if it has marginal utility at least $\tau$, up to a maximum of k elements.

Lemma 4.1 (Threshold Algorithm). Let $C^*$ satisfy $f(C^*) = \mathrm{OPT}$. The threshold algorithm on inputs $(\tau, k)$ returns
a set S that either has k elements and hence a value of at least $\tau k$, or a set S with value $f(S) \ge f(S \cup C^*) - |C^*|\tau$.

Proof. The claim is immediate if the algorithm picks k elements, so suppose it does not pick k elements, and also
$f(S) < f(S \cup C^*) - |C^*|\tau$. Then $f_S(C^*) > |C^*|\tau$, or $\tau < \frac{f_S(C^*)}{|C^*|} \le \frac{\sum_{e \in C^*} f_S(e)}{|C^*|}$. By averaging, this implies there
exists an element $e \in C^*$ such that $f_S(e) > \tau$; this element cannot have been chosen into S (otherwise the marginal
value would be 0), but it would have been chosen into S when it was considered by the algorithm (since at that time
its marginal value would only have been higher). This gives the desired contradiction. □
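The threshold rule is a one-pass procedure, which the sketch below makes explicit. The objective is again a hypothetical cut function, and `lemma_4_1_holds` is a brute-force check of the dichotomy in Lemma 4.1 against every set C of size at most k (the proof does not depend on C being optimal); `lemma_holds_for_all_orders` repeats it over every arrival order. All names are illustrative assumptions.

```python
from itertools import combinations, permutations

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def threshold_algorithm(f, stream, tau, k):
    """Single pass: take an element iff its marginal value is at least tau,
    until k elements have been taken."""
    S = frozenset()
    for e in stream:
        if len(S) < k and f(S | {e}) - f(S) >= tau:
            S = S | {e}
    return S

def lemma_4_1_holds(f, stream, tau, k):
    """Either |S| = k and f(S) >= tau * k, or f(S) >= f(S u C) - |C| * tau
    for every C of size at most k drawn from the stream."""
    S = threshold_algorithm(f, stream, tau, k)
    if len(S) == k:
        return f(S) >= tau * k
    return all(f(S) >= f(S | frozenset(C)) - len(C) * tau
               for r in range(k + 1) for C in combinations(stream, r))

def lemma_holds_for_all_orders(f, ground, tau, k):
    """Repeat the check over every arrival order (worst-case order model)."""
    return all(lemma_4_1_holds(f, list(order), tau, k)
               for order in permutations(ground))
```

Because the guarantee is order-independent, this is the piece that can be run online once a good threshold is known.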

Theorem 4.2. If we change Algorithm Submod-Max-Cardinality from §2 to use the threshold algorithm with threshold
$\tau = \frac{\mathrm{OPT}}{7k}$ in Step 3, and to use the random sampling algorithm $\mathrm{FMV}_4$ in Step 4, and return a (uniformly) random
one of $S_1, S'_1, S_2$ in Step 7, the expected value of the returned set is at least OPT/21.

Proof. We show that $f(S_1) + f(S'_1) + f(S_2) \ge \tau k = \mathrm{OPT}/7$, and picking a random one of these gets a third of that
in expectation. Indeed, if $S_1$ or $S_2$ has k elements, then $f(S_1) + f(S_2) \ge \tau k$. Else if $f(S_1 \cap C^*) \ge 4\tau k$, then $\mathrm{FMV}_4$
guarantees that $f(S'_1) \ge \tau k$. Else $f(S_1) + f(S_2) \ge (f(S_1 \cup C^*) - \tau k) + (f(S_2 \cup (C^* \setminus S_1)) - \tau k) + (f(S_1 \cap C^*) - 4\tau k)$,
which by Lemma 2.2 is at least $\mathrm{OPT} - 6\tau k = \tau k$. □



Observation 4.3. Given the value of OPT, the algorithm of Theorem 4.2 can be implemented in an online fashion 
where we (irrevocably) pick at most k elements. 

Proof. We can randomly choose which one of $S_1, S'_1, S_2$ we want to output before observing any elements. Clearly
$S_1$ can be determined online, as can $S_2$ by choosing any element that has high marginal value and is not chosen in
$S_1$. Moreover, $S'_1$ just selects elements from $S_1$ independently with probability 1/2. □
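One possible way to organize this online implementation is sketched below. It is an illustration under assumptions, not the authors' code: the output set is chosen up front, the first threshold run is simulated as bookkeeping in every mode, and the second run only sees elements rejected by the first. The threshold `opt_guess / (7 * k)` follows Theorem 4.2; the cut objective and all names are hypothetical.

```python
import random

# Hypothetical objective: cut function of a small graph on vertices 0..4.
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

def cut(S):
    return sum((u in S) != (v in S) for u, v in EDGES)

def online_with_advice(f, stream, k, opt_guess, rng):
    """One pass over `stream`, irrevocably selecting at most k elements."""
    tau = opt_guess / (7 * k)
    # Decide which of the three candidate sets to output before seeing anything.
    mode = rng.choice(["S1", "S1_prime", "S2"])
    S1 = frozenset()   # simulated first threshold run (bookkeeping only)
    out = frozenset()  # elements actually selected
    for e in stream:
        if len(S1) < k and f(S1 | {e}) - f(S1) >= tau:
            S1 = S1 | {e}
            # Output S1 itself, or a random half of S1, depending on the mode.
            if mode == "S1" or (mode == "S1_prime" and rng.random() < 0.5):
                out = out | {e}
        elif mode == "S2" and len(out) < k and f(out | {e}) - f(out) >= tau:
            # Second threshold run, restricted to elements not taken by S1.
            out = out | {e}
    return out
```

In the `"S2"` mode the variable `out` is exactly the second threshold set, so the three modes together realize the three candidate solutions with a single irrevocable pass.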



Observation 4.4. In both the algorithms of Theorems 2.3 and 4.2, if we use some value $Z \le \mathrm{OPT}$ instead of OPT,
the returned set has value at least $Z/(4 + \alpha)$, and expected value at least $Z/21$, respectively.

Finally, it will be convenient to recall Dynkin's algorithm: given a stream of n numbers randomly ordered, it 
samples the first 1/e fraction of the numbers and picks the next element that is larger than all elements in the sample. 
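Dynkin's rule as stated here fits in a few lines. The sketch below assumes distinct values and rounds the sample size down to $\lfloor n/e \rfloor$ (the rounding choice is an assumption); `success_probability` enumerates all arrival orders of a small instance to estimate how often the maximum is picked.

```python
import math
from itertools import permutations

def dynkin(values):
    """Observe the first floor(n/e) values, then return the index of the first
    later value that beats all of them (or None if there is no such value)."""
    n = len(values)
    m = int(n / math.e)
    threshold = max(values[:m], default=float("-inf"))
    for i in range(m, n):
        if values[i] > threshold:
            return i
    return None

def success_probability(n):
    """Fraction of arrival orders of 0..n-1 in which the maximum is picked."""
    wins = total = 0
    for order in permutations(range(n)):
        i = dynkin(list(order))
        wins += (i is not None and order[i] == n - 1)
        total += 1
    return wins / total
```

For small n the success probability already exceeds 1/3, in line with the classical 1/e guarantee as n grows.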

4.1.1 The Secretary Algorithm for the Cardinality Case 



7 



For a constrained submodular optimization problem, if we are given (a) a $\rho_{\mathrm{off}}$-approximate offline algorithm, and also (b) a $\rho_{\mathrm{on}}$-approximate online advice-taking algorithm that works given an estimate of OPT, we can now get an algorithm in the secretary model thus: we use the offline algorithm to estimate OPT on the first half of the elements, and then run the advice-taking online algorithm with that estimate. The formal algorithm appears in Figure 3. Because of space constraints, we have deferred the proof of the following theorem to Appendix C.



Let Solution ← ∅.
Flip a fair coin.
if heads then
    Solution ← most valuable item using Dynkin's-Algo
else
    Let m ∈ B(n, 1/2) be a draw from the binomial distribution
    A_1 ← ρ_off-approximate offline algorithm on the first m elements.
    A_2 ← ρ_on-approximate advice-taking online algorithm with
          f(A_1) as the guess for OPT.
    Return A_2
end if

Figure 3: Algorithm SubmodularSecretaries



Theorem 4.5. The above algorithm is an O(1)-approximation algorithm for the cardinality-constrained submodular maximization problem in the secretary setting.
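The control flow of Figure 3 can be sketched as follows. This is only an illustration: `greedy_offline` and `threshold_online` are simplified stand-ins chosen for brevity (they are not the paper's $\rho_{\mathrm{off}}$ and $\rho_{\mathrm{on}}$ algorithms), and all function names are assumptions.

```python
import math
import random

def greedy_offline(elements, f, k):
    """Stand-in offline algorithm: plain greedy under a cardinality bound k."""
    chosen = set()
    for _ in range(k):
        rest = [e for e in elements if e not in chosen]
        if not rest:
            break
        best = max(rest, key=lambda e: f(chosen | {e}) - f(chosen))
        if f(chosen | {best}) - f(chosen) <= 0:
            break
        chosen.add(best)
    return chosen

def threshold_online(elements, f, k, guess):
    """Stand-in advice-taking algorithm: keep up to k elements whose
    marginal value is at least guess / (7k)."""
    tau = guess / (7 * k)
    chosen = set()
    for e in elements:
        if len(chosen) < k and f(chosen | {e}) - f(chosen) >= tau:
            chosen.add(e)
    return chosen

def dynkin_single(elements, f):
    """Dynkin's rule applied to singleton values f({e})."""
    s = int(len(elements) / math.e)
    best = max((f({e}) for e in elements[:s]), default=float("-inf"))
    for e in elements[s:]:
        if f({e}) > best:
            return {e}
    return set()

def submodular_secretaries(stream, f, k, rng):
    if rng.random() < 0.5:                        # heads: single best item
        return dynkin_single(stream, f)
    m = sum(rng.random() < 0.5 for _ in stream)   # m ~ B(n, 1/2)
    a1 = greedy_offline(stream[:m], f, k)         # estimate OPT on the prefix
    return threshold_online(stream[m:], f, k, f(a1))
```

The only structural points taken from the figure are the fair coin, the binomial split point, and the use of $f(A_1)$ as the guess for OPT on the remaining elements.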



4.2 Subject to a Partition Matroid Constraint 

In this section, we give a constant-factor approximation for maximizing submodular functions subject to a partition 
matroid. Recall that in such a matroid, the universe is partitioned into k "groups", and the independent sets are 
those which contain at most one element from each group. To get a secretary-style algorithm for modular (additive) 
function maximization subject to a partition matroid, we can run Dynkin's algorithm on each group independently. 
However, if we have a submodular function, the marginal value of an element depends on the elements previously 
picked — and hence the marginal value of an element as seen by the online algorithm and the adversary become very 
different. 

We first build some intuition by considering a simpler "contiguous partitions" model where all the elements of each group arrive together (in random order), but the groups of the partition are presented in some arbitrary order $g_1, g_2, \ldots, g_r$. We then go on to handle the case when all the elements indeed come in completely random order, using what is morally a reduction to the contiguous partitions case.



4.2.1 A Special Case: Contiguous Partitions 

For the contiguous case, one can show that executing Dynkin's algorithm with the obvious marginal valuation 
function is a good algorithm: this is not immediate, since the valuation function changes as we pick some elements — 
but it works out, since the groups come contiguously. Now, as in the previous section, one wants to run two parallel 
copies of this algorithm (with the second one picking elements from among those not picked by the first) — but the 
correlation causes the second algorithm to not see a random permutation any more! We get around this by coupling 
the two together as follows: 

Initially, the algorithm determines whether it is one of 3 different modes (A, B, or C) uniformly at
random. The algorithm maintains a set of selected elements, initially $S_0$. When group $g_i$ of the partition
arrives, it runs Dynkin's secretary algorithm on the elements from this group using valuation function
$f_{S_{i-1}}$. If Dynkin's algorithm selects an element $x$, our algorithm flips a coin. If we are in modes A or
B, we let $S_i \leftarrow S_{i-1} \cup \{x\}$ if the coin is heads, and let $S_i \leftarrow S_{i-1}$ otherwise. If we are in mode C,
we do the reverse, and let $S_i \leftarrow S_{i-1} \cup \{x\}$ if the coin is tails, and let $S_i \leftarrow S_{i-1}$ otherwise. Finally,
after the algorithm has completed, if we are in mode B, we discard each element of $S_r$ with probability
1/2. (Note that we can actually implement this step online, by 'marking' but not selecting elements
with probability 1/2 when they arrive).
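The three-mode procedure above can be sketched in Python as follows. This is a hedged illustration: it assumes $S_0 = \emptyset$, represents each group as a list in arrival order, and keeps a separate `state` (the set that drives the marginal valuation, including 'marked' elements) and `output` (what is actually selected). The function names are not from the paper.

```python
import math
import random

def dynkin_pick(group, value):
    """Dynkin's rule on one group under the valuation `value`."""
    s = int(len(group) / math.e)
    best = max((value(x) for x in group[:s]), default=float("-inf"))
    for x in group[s:]:
        if value(x) > best:
            return x
    return None

def contiguous_partition_secretary(groups, f, rng):
    mode = rng.choice("ABC")
    state = set()     # S_{i-1}: used for the marginal valuation f_{S_{i-1}}
    output = set()    # elements irrevocably selected
    for group in groups:
        base = f(state)
        x = dynkin_pick(group, lambda y: f(state | {y}) - base)
        if x is None:
            continue
        heads = rng.random() < 0.5
        take = heads if mode in "AB" else not heads
        if take:
            state.add(x)
            # Mode B: mark x, but actually select it only with probability 1/2.
            if mode != "B" or rng.random() < 0.5:
                output.add(x)
    return mode, output
```

Separating `state` from `output` is one way to realize the online 'marking' remark: in mode B the valuation evolves exactly as in mode A, while the returned set is a random half of it.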

Lemma 4.6. The above algorithm is a (3 + 6e) -approximation for the submodular maximization problem under 
partition matroids, when each group of the partition comes as a contiguous segment. 
Proof. We first analyze the case in which the algorithm is in mode A or C. Consider a hypothetical run of two versions of our algorithm simultaneously, one in mode A and one in mode C, which share coins and produce sets $S_r^A$ and $S_r^C$. The two algorithms run with identical marginal distributions, but are coupled such that whenever both algorithms attempt to select the same element (each with probability 1/2), we flip only one coin, so one succeeds while the other fails. Note that $S_r^C \subseteq U \setminus S_r^A$, and so we will be able to apply Lemma 2.2. For a fixed permutation $\pi$, let $S_r^A(\pi)$ be the set chosen by the mode A algorithm for that particular permutation. As usual, we define $f_A(B) = f(A \cup B) - f(A)$. Hence, $f(S_r^A(\pi)) = f(S_r^A(\pi) \cup C^*) - f_{S_r^A(\pi)}(C^*)$, and taking expectations, we get

$$E[f(S_r^A)] = E[f(S_r^A \cup C^*)] - E[f_{S_r^A}(C^*)]. \qquad (6)$$

Now, for any $e \in X$, let $j(e)$ be the index of the group containing $e$; hence we have

$$E[f_{S_r^A}(C^*)] \le E\Big[\sum_{e \in C^*} f_{S_r^A}(\{e\})\Big] \le E\Big[\sum_{e \in C^*} f_{S^A_{j(e)-1}}(\{e\})\Big] \le 2\mathrm{e} \cdot E\Big[\sum_{e \in C^*} f_{S^A_{j(e)-1}}(S^A_{j(e)})\Big] = 2\mathrm{e} \cdot E[f(S_r^A)], \qquad (7)$$

where the first inequality is just subadditivity, the second submodularity, the third follows from the fact that Dynkin's algorithm is an e-approximation for the secretary problem and selecting the element that Dynkin's selects with probability 1/2 gives a 2e approximation, and the resulting telescoping sum gives the fourth equality. Now substituting (7) into (6) and rearranging, we get $E[f(S_r^A)] \ge \frac{1}{1+2\mathrm{e}} E[f(S_r^A \cup C^*)]$. An identical analysis of the second hypothetical algorithm gives: $E[f(S_r^C)] \ge \frac{1}{1+2\mathrm{e}} E[f(S_r^C \cup (C^* \setminus S_r^A))]$.

It remains to analyze the case in which the algorithm runs in mode B. In this case, the algorithm generates a set $S_r^B$ by selecting each element in $S_r^A$ uniformly at random. By the theorem of [FMV07], uniform random sampling achieves a 4-approximation to the problem of unconstrained submodular maximization. Therefore, we have in this case: $E[f(S_r^B)] \ge \frac{1}{4} f(S_r^A \cap C^*)$. By Lemma 2.2 we therefore have: $E[f(S_r^A)] + E[f(S_r^C)] + E[f(S_r^B)] \ge \frac{1}{1+2\mathrm{e}} f(C^*)$. Since our algorithm outputs one of these three sets uniformly at random, it gets a (3+6e) approximation to $f(C^*)$. □



4.2.2 General Case 

We now consider the general secretary setting, in which the elements come in random order, not necessarily grouped 
by partition. Our previous approach will not work: we cannot simply run Dynkin's secretary algorithm on contiguous 
chunks of elements, because some elements may be blocked by our previous choices. We instead do something 
similar in spirit: we divide the elements up into k 'epochs', and attempt to select a single element from each. We 
treat every element that arrives before the current epoch as part of a sample, and according to the current valuation
function at the beginning of an epoch, we select the first element that we encounter that has higher value than any 
element from its own partition group in the sample, so long as we have not already selected something from the 
same partition group. Our algorithm is as follows: 

Initially, the algorithm determines whether it is one of 3 different modes (A, B, or C) uniformly at
random. The algorithm maintains a set of selected elements, initially $S_0$, and observes the first $N_0 \sim B(n, \frac{1}{2})$
of the elements without selecting anything. The algorithm then considers $k$ epochs, where the
$i$th epoch is the set of $N_i \sim B(n, \frac{1}{100k})$ contiguous elements after the $(i-1)$th epoch. At epoch $i$, we use
valuation function $f_{S_{i-1}}$. If an element $x$ has higher value than any element from its own partition group
that arrived earlier than epoch $i$, we flip a coin. If we are in modes A or B, we let $S_i \leftarrow S_{i-1} \cup \{x\}$
if the coin is heads, and let $S_i \leftarrow S_{i-1}$ otherwise. If we are in mode C, we do the reverse, and let
$S_i \leftarrow S_{i-1} \cup \{x\}$ if the coin is tails, and let $S_i \leftarrow S_{i-1}$ otherwise. After all $k$ epochs have passed,
we ignore the remaining elements. Finally, after the algorithm has completed, if we are in mode B, we
discard each element of $S_k$ with probability 1/2. (Note that we can actually implement this step online,
by 'marking' but not selecting elements with probability 1/2 when they arrive).

If we were guaranteed to select an element in every epoch $i$ that was the highest valued element according to
$f_{S_{i-1}}$, then the analysis of this algorithm would be identical to the analysis in the contiguous case. This is of course
not the case. However, we prove a technical lemma that says that we are "close enough" to this case. 






Lemma 4.7. For all partition groups $i$ and epochs $j$, the algorithm selects the highest element from group $i$ (according to the valuation function $f_{S_{j-1}}$ used during epoch $j$) during epoch $j$ with probability at least $\Omega(1/k)$.

Because of space constraints, we defer the proof of this technical lemma to Appendix C.

Note an immediate consequence of the above lemma: if $e$ is the element selected from epoch $j$, by summing over the elements in the optimal set $C^*$ (one from each of the $k$ partition groups), we get:

$$E[f_{S_{j-1}}(e)] \ge \Omega\Big(\frac{f_{S_{j-1}}(C^*)}{k}\Big) \ge \Omega\Big(\frac{f_{S_r}(C^*)}{k}\Big).$$

Summing over the expected contribution to $S_r$ from each of the $k$ epochs and applying submodularity, we get $E[f_{S_r^A}(C^*)] \le O(E[f(S_r^A)])$. Using this derivation in place of inequality (7) in the proof of Lemma 4.6 proves that our algorithm gives an O(1) approximation to the non-monotone submodular maximization problem subject to a partition matroid constraint.



4.3 Subject to a General Matroid Constraint 

We consider matroid constraints where the matroid is $\mathcal{M} = (\Omega, \mathcal{I})$ with rank $k$. Let $w_1 = \max_{e \in \Omega} f(\{e\})$ be the maximum value obtained by any single element, and let $e_1$ be the element that achieves this maximum value. (Note that we do not know these values up-front in the secretary setting.) In this section, we first give an algorithm that gets a set of fairly high value given a threshold $\tau$. We then show how to choose this threshold, assuming we know the value $w_1$ of the most valuable element, and why this implies an advice-taking online algorithm having a logarithmic approximation. Finally, we show how to implement this in a secretary framework.

A Threshold Algorithm. Given a value $\tau$, run the following algorithm. Initialize $S_1, S_2 \leftarrow \emptyset$. Go over the elements of the universe $\Omega$ in arbitrary order: when considering element $e$, add it to $S_1$ if $f_{S_1}(e) \ge \epsilon\tau$ and $S_1 \cup \{e\}$ is independent, else add it to $S_2$ if $f_{S_2}(e) \ge \epsilon\tau$ and $S_2 \cup \{e\}$ is independent, else discard it. (We will choose the value of $\epsilon$ later.) Finally, output a uniformly random one of $S_1$ or $S_2$.
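The threshold algorithm translates almost line for line into code. The sketch below assumes the matroid is available through an independence oracle `is_independent` and that `f` is a set-function oracle; both names, and the split into `threshold_sets` and `threshold_algorithm`, are illustrative choices rather than the paper's.

```python
import random

def threshold_sets(elements, f, is_independent, tau, eps=0.4):
    """Build S1 and S2 in one pass, using the threshold eps * tau."""
    s1, s2 = set(), set()
    for e in elements:
        if f(s1 | {e}) - f(s1) >= eps * tau and is_independent(s1 | {e}):
            s1.add(e)
        elif f(s2 | {e}) - f(s2) >= eps * tau and is_independent(s2 | {e}):
            s2.add(e)
        # otherwise e is discarded
    return s1, s2

def threshold_algorithm(elements, f, is_independent, tau, eps=0.4, rng=random):
    """Output a uniformly random one of S1 and S2."""
    s1, s2 = threshold_sets(elements, f, is_independent, tau, eps)
    return s1 if rng.random() < 0.5 else s2
```

The default `eps=0.4` corresponds to the choice $\epsilon = 2/5$ made in Lemma 4.8.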

To analyze this algorithm, let $C^*$ be the optimal set with $f(C^*) = \mathrm{OPT}$. Order the elements of $C^*$ by picking its elements greedily based on marginal values. Given $\tau > 0$, let $C^*_\tau \subseteq C^*$ be the elements whose marginal benefit was at least $\tau$ when added in this greedy order: note that $f(C^*_\tau) \ge |C^*_\tau| \tau$.

Lemma 4.8. For $\epsilon = 2/5$, the set produced by our algorithm has expected value at least $|C^*_\tau| \cdot \tau/10$.

Proof. If either $|S_1|$ or $|S_2|$ is at least $|C^*_\tau|/4$, we get value at least $|C^*_\tau|/4 \cdot \epsilon\tau$. Else both these sets have small cardinality. Since we are in a matroid, there must be a set $A \subseteq C^*_\tau$ of cardinality $|A| \ge |C^*_\tau| - |S_1| - |S_2| \ge |C^*_\tau|/2$, such that $A$ is disjoint from both $S_1$ and $S_2$, and both $S_1 \cup A$ and $S_2 \cup A$ lie in $\mathcal{I}$ (i.e., they are independent).

We claim that $f(S_1) \ge f(S_1 \cup A) - |A| \cdot \epsilon\tau$. Indeed, an element $e \in A$ was not added by the threshold algorithm; since it could be added while maintaining independence, it must have been discarded because the marginal value was less than $\epsilon\tau$. Hence $f_{S_1}(\{e\}) < \epsilon\tau$, and hence $f(S_1 \cup A) - f(S_1) = f_{S_1}(A) \le \sum_{e \in A} f_{S_1}(\{e\}) < |A| \cdot \epsilon\tau$. Similarly, $f(S_2) \ge f(S_2 \cup A) - |A| \cdot \epsilon\tau$. And by disjointness, $f(S_1 \cap A) = f(\emptyset) = 0$. Hence, summing these and applying Lemma 2.2, we get that $f(S_1) + f(S_2) \ge f(S_1 \cup A) + f(S_2 \cup A) + f(S_1 \cap A) - 2\epsilon\tau|A| \ge f(A) - 2\epsilon\tau|A|$.

Since the marginal values of all the elements in $C^*_\tau$ were at least $\tau$ when they were added by the greedy ordering, and $A \subseteq C^*_\tau$, submodularity implies that $f(A) \ge |A|\tau$, which in turn implies $f(S_1) + f(S_2) \ge (1 - 2\epsilon)\tau|A| \ge (1 - 2\epsilon)\tau|C^*_\tau|/2$. A random one of $S_1, S_2$ gets half of that in expectation. Taking the minimum of $|C^*_\tau|/4 \cdot \epsilon\tau$ and $(1 - 2\epsilon)\tau|C^*_\tau|/2$ and setting $\epsilon = 2/5$, we get the claim. □

Lemma 4.9. $\sum_{i=0}^{\log 2k} |C^*_{w_1/2^i}| \cdot \frac{w_1}{2^i} \ge f(C^*)/4 = \mathrm{OPT}/4$.

Proof. Consider the greedy enumeration $\{e_1, e_2, \ldots, e_t\}$ of $C^*$, and let $w_j = f_{\{e_1, e_2, \ldots, e_{j-1}\}}(\{e_j\})$. First consider an infinite summation $\sum_{i=0}^{\infty} |C^*_{w_1/2^i}| \cdot \frac{w_1}{2^i}$: each element $e_j$ contributes at least $w_j/2$ to it, and hence the summation is at least $\frac{1}{2} \sum_j w_j$. But $f(C^*) = \sum_{j=1}^{t} w_j$, which says the infinite sum is at least $f(C^*)/2 = \mathrm{OPT}/2$. But the finite sum merely drops a contribution of $w_1/4k$ from at most $|C^*| \le k$ elements, and clearly OPT is at least $w_1$, so removing this contribution means the finite sum is at least OPT/4. □
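The bucketed sum in Lemma 4.9 is easy to evaluate numerically. The sketch below is illustrative only: it takes the list of greedy marginals $w_j$ as input, assumes they are non-negative, assumes $w_1$ equals the largest of them, and rounds $\log_2 2k$ down to an integer; `bucketed_sum` is a hypothetical helper name.

```python
import math

def bucketed_sum(marginals, k):
    """Sum over thresholds tau = w1 / 2^i, i = 0..floor(log2(2k)), of
    (number of marginals that are at least tau) * tau."""
    w1 = max(marginals)
    levels = int(math.log2(2 * k))
    total = 0.0
    for i in range(levels + 1):
        tau = w1 / 2 ** i
        total += sum(1 for w in marginals if w >= tau) * tau
    return total
```

For marginals 8, 4, 2, 1 and k = 4 the thresholds are 8, 4, 2, 1, and the sum is 8 + 8 + 6 + 4 = 26, comfortably above OPT/4 = 15/4.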
Hence, if we choose a value $\tau$ uniformly from $w_1, w_1/2, w_1/4, \ldots, w_1/2k$ and run the above threshold algorithm with that setting of $\tau$, we get that the expected value of the set output by the algorithm is at least:

$$\frac{1}{1 + \log_2 2k} \sum_{i=0}^{\log_2 2k} |C^*_{w_1/2^i}| \cdot \frac{w_1}{10 \cdot 2^i} \ge \frac{1}{1 + \log_2 2k} \cdot \frac{\mathrm{OPT}}{40}. \qquad (8)$$



The Secretary Algorithm. The secretary algorithm for general matroids is the following: 

Sample half the elements, let $W$ be the weight of the highest weight element in the first half. Choose a
value $i \in \{0, 1, \ldots, 2 + \log 2k\}$ uniformly at random. Run the threshold algorithm with $W/2^i$ as the
threshold.

Lemma 4.10. The algorithm is an $O(\log k)$-approximation in the secretary setting for rank $k$ matroids.

Proof. With probability $\Omega(1/\log k)$, we choose the value $i = 0$. In this case, with constant probability the element with second-highest value comes in the first half, and the highest-value element $e_1$ comes in the second half; hence our (conditional) expected value in this case is at least $w_1$. In case this single element accounts for more than half of the optimal value, we get $\Omega(\mathrm{OPT}/\log k)$. We ignore the case $i = 1$. If we choose $i \ge 2$, now with constant probability $e_1$ comes in the first half, implying that $W = w_1$. Moreover, each element in $C^* - e_1$ appears in the second half with probability slightly higher than 1/2. Since $e_1$ accounts for at most half the optimal value, the expected optimal value in the second half is at least OPT/4. The above argument then ensures that we get value $\Omega(\mathrm{OPT}/\log k)$ in expectation. □
A Proof of Main Lemma for p-Systems

Let $e_1, e_2, \ldots, e_k$ be the elements added to $S$ by greedy, and let $S_i$ be the first $i$ elements in this order, with $\delta_i = f_{S_{i-1}}(\{e_i\}) = f(S_i) - f(S_{i-1})$, which may be positive or negative. Since $f(\emptyset) = 0$, we have $f(S = S_k) = \sum_i \delta_i$. And since $f$ is submodular, $\delta_i \ge \delta_{i+1}$ for all $i$.



Lemma A.1 (following [CCPV09]). For any independent set $C$, it holds that $f(S_k) \ge \frac{1}{p+1} f(C \cup S_k)$.



Proof. We show the existence of a partition of $C$ into $C_1, C_2, \ldots, C_k$ with the following two properties:

• for all $i \in [k]$, $p_1 + p_2 + \cdots + p_i \le i \cdot p$ where $p_i := |C_i|$, and

• for all $i \in [k]$, $p_i \delta_i \ge f_{S_k}(C_i)$.

Assuming such a partition, we can complete the proof thus:

$$p \sum_{i=1}^{k} \delta_i \ge \sum_{i=1}^{k} p_i \delta_i \ge \sum_{i=1}^{k} f_{S_k}(C_i) \ge f_{S_k}(C) = f(S_k \cup C) - f(S_k), \qquad (9)$$

where the first inequality follows from [CCPV09, Claim A.1] (using the first property above, and that the $\delta$'s are non-increasing), the second from the second property of the partition of $C$, the third from subadditivity of $f_{S_k}$ (which is implied by the submodularity of $f$ and applications of both facts in Proposition 1.1), and the fourth from the definition of $f_{S_k}(\cdot)$. Using the fact that $\sum_i \delta_i = f(S_k)$, and rearranging, we get the lemma.

Now to prove the existence of such a partition of $C$. Define $A_0, A_1, \ldots, A_k$ as follows: $A_i = \{e \in C \setminus S_i \mid S_i + e \in \mathcal{I}\}$. Note that since $C \in \mathcal{I}$, it follows that $A_0 = C$; since the independence system is closed under subsets, we have $A_i \subseteq A_{i-1}$, and since the greedy algorithm stops only when there are no more elements to add, we get $A_k = \emptyset$. Defining $C_i = A_{i-1} \setminus A_i$ ensures we have a partition $C_1, C_2, \ldots, C_k$ of $C$.

Fix a value $i$. We claim that $S_i$ is a basis (a maximal independent set) for $S_i \cup (C_1 \cup C_2 \cup \ldots \cup C_i) = S_i \cup (C \setminus A_i)$. Clearly $S_i \in \mathcal{I}$ by construction; moreover, any $e \in (C \setminus A_i) \setminus S_i$ was considered but not added to $A_i$ because $S_i + e \notin \mathcal{I}$. Moreover, $(C_1 \cup C_2 \cup \ldots \cup C_i) \subseteq C$ is clearly independent by subset-closure. Since $\mathcal{I}$ is a $p$-independence system, $|C_1 \cup C_2 \cup \ldots \cup C_i| \le p \cdot |S_i|$, and thus $\sum_{j \le i} p_j \le i \cdot p$, proving the first property.

For the second property, note that $C_i = A_{i-1} \setminus A_i \subseteq A_{i-1}$; hence each $e \in C_i$ does not belong to $S_{i-1}$ but could have been added to $S_{i-1}$ whilst maintaining independence, and was considered by the greedy algorithm. Since greedy chose the $e_i$ maximizing the "gain", $\delta_i \ge f_{S_{i-1}}(\{e\})$ for each $e \in C_i$. Summing over all $e \in C_i$, we get $p_i \delta_i \ge \sum_{e \in C_i} f_{S_{i-1}}(\{e\}) \ge f_{S_{i-1}}(C_i)$, where the last inequality is by the subadditivity of $f_{S_{i-1}}$. Again, by submodularity, $f_{S_{i-1}}(C_i) \ge f_{S_k}(C_i)$, which proves the second fact about the partition $\{C_j\}_{j=1}^{k}$ of $C$. □

Clearly, the greedy algorithm works no worse if we stop it when the best "gain" is negative, but the above proof 
does not use that fact. 
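The greedy procedure analyzed in this appendix can be sketched as follows, under the assumption that the independence system is given by an oracle `is_independent` and that ties are broken by list order; the function name `greedy` and its return format are illustrative. As in the proof, this version stops only when no element can be added.

```python
def greedy(elements, f, is_independent):
    """Repeatedly add the feasible element with the largest marginal gain.
    Returns the picked elements in order and the gains delta_1, delta_2, ..."""
    picked, deltas = [], []
    while True:
        current = set(picked)
        candidates = [e for e in elements
                      if e not in current and is_independent(current | {e})]
        if not candidates:
            break
        best = max(candidates, key=lambda e: f(current | {e}) - f(current))
        deltas.append(f(current | {best}) - f(current))
        picked.append(best)
    return picked, deltas
```

On the cut function of the path 0-1-2-3 with a rank-2 uniform matroid (so p = 1), greedy picks vertex 1 (gain 2) and then vertex 3 (gain 1); the gains are non-increasing, and $f(S_k) = 3$ is at least half of $f(C \cup S_k)$ for every independent $C$, as Lemma A.1 requires.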



B Proofs for Knapsack Constraints 

The proof is similar to that in [Svi04] and the proof of Lemma 2.1. We use notation similar to [Svi04] for consistency. Let $f$ be a non-negative submodular function with $f(\emptyset) = 0$. Let $I = [n]$, and we are given $n$ items with weights $c_i \in \mathbb{Z}_+$, and $B \ge 0$; let $\mathcal{F} = \{S \subseteq I \mid c(S) \le B\}$, where $c(S) = \sum_{i \in S} c_i$. Our goal is to solve $\max_{S \in \mathcal{F}} f(S)$. To that end, we want to prove the following result:
Theorem B.1. There is a polynomial-time algorithm that outputs a collection of sets such that for any $C \in \mathcal{F}$, the collection contains a set $S$ satisfying $f(S) \ge \frac{1}{2} f(S \cup C)$.¹

B.l The Algorithm 

The algorithm is the following: it constructs a polynomial number of solutions and chooses the best among them 
(and in case of ties, outputs the lexicographically smallest one of them). 

• First, the family contains all solutions with cardinality 1, 2, 3: clearly, if $|C| \le 3$ then we will output $C$ itself, which will satisfy the condition of the theorem.

• Now for each solution $U \subseteq I$ of cardinality 3, we greedily extend it as follows: Set $S_0 = U$, $I_0 = I$. At step $t$, we have a partial solution $S_{t-1}$. Now compute

$$\theta_t = \max_{i \in I_{t-1} \setminus S_{t-1}} \frac{f(S_{t-1} + i) - f(S_{t-1})}{c_i}.$$

Let the maximum be achieved on index $i_t$. If $\theta_t \le 0$, terminate the algorithm. Else check if $c(S_{t-1} + i_t) \le B$: if so, set $S_t = S_{t-1} + i_t$ and $I_t = I_{t-1}$; else set $S_t = S_{t-1}$ and $I_t = I_{t-1} - i_t$. Stop if $I_t \setminus S_t = \emptyset$.

The family of sets we output is all sets of cardinality at most three, as well as for each greedy extension of a set of cardinality three, we output all the sets $S_t$ created during the run of the algorithm. Since each set can have at most $n$ elements, we get $O(n^4)$ sets output by the algorithm.
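The two-phase construction above can be sketched as follows. This is an illustration under stated assumptions: items are comparable identifiers (ties in the ratio are broken by item id), `cost` maps items to positive weights, and only feasible small sets are kept and extended, which the text does not spell out. The names `greedy_extensions` and `candidate_family` are not from the paper.

```python
from itertools import combinations

def greedy_extensions(start, items, cost, f, budget):
    """All partial solutions S_t produced when greedily extending `start`."""
    s = frozenset(start)
    pool = set(items)                       # I_{t-1}
    out = [s]
    while pool - s:
        # Largest marginal gain per unit cost among remaining items.
        theta, i = max(((f(s | {i}) - f(s)) / cost[i], i) for i in pool - s)
        if theta <= 0:
            break
        if sum(cost[j] for j in s) + cost[i] <= budget:
            s = s | {i}                     # item fits: add it
        else:
            pool.discard(i)                 # item does not fit: drop it
        out.append(s)
    return out

def candidate_family(items, cost, f, budget):
    """Feasible sets of size 1..3, plus every greedy extension of a triple."""
    def feasible(u):
        return sum(cost[i] for i in u) <= budget
    family = set()
    for size in (1, 2, 3):
        family.update(frozenset(u) for u in combinations(items, size) if feasible(u))
    for u in combinations(items, 3):
        if feasible(u):
            family.update(greedy_extensions(u, items, cost, f, budget))
    return family
```

The final answer would be `max(candidate_family(...), key=f)`; with $O(n^3)$ triples and at most $n$ greedy steps per triple, the family has $O(n^4)$ members.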

B.2 The Analysis 

Let us assume that $|C| = t > 3$, and order $C$ as $j_1, j_2, \ldots, j_t$ such that

$$f_{\{j_1, \ldots, j_{k-1}\}}(\{j_k\}) = \max_{j \in C \setminus \{j_1, \ldots, j_{k-1}\}} f_{\{j_1, \ldots, j_{k-1}\}}(\{j\}),$$

i.e., index the elements in the order they would be considered by the greedy algorithm that picks items of maximum marginal value (and does not consider their weights $c_j$). Let $Y = \{j_1, j_2, j_3\}$. Submodularity and the ordering of $C$ gives us the following:

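Lemma B.2. For any $j_k \in C$ with $k \ge 4$ and any $Z \subseteq I \setminus \{j_1, j_2, j_3, j_k\}$, it holds that:

$$f_{Y \cup Z}(\{j_k\}) \le f(\{j_k\}) \le f(\{j_1\})$$
$$f_{Y \cup Z}(\{j_k\}) \le f(\{j_1, j_k\}) - f(\{j_1\}) \le f(\{j_1, j_2\}) - f(\{j_1\})$$
$$f_{Y \cup Z}(\{j_k\}) \le f(\{j_1, j_2, j_k\}) - f(\{j_1, j_2\}) \le f(\{j_1, j_2, j_3\}) - f(\{j_1, j_2\})$$

Summing the above three inequalities we get that for $j_k \notin Y \cup Z$,

$$3 \cdot f_{Y \cup Z}(\{j_k\}) \le f(Y). \qquad (12)$$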

For the rest of the discussion, consider the iteration of the algorithm which starts with $S_0 = Y$. For $S$ such that $S_0 = Y \subseteq S \subseteq I$, recall that $f_Y(S) = f(Y \cup S) - f(Y) = f(S) - f(Y)$. Proposition 1.1 shows that $f_Y(\cdot)$ is a submodular function with $f_Y(\emptyset) = 0$. The following lemma is the analog of [Svi04, eq. 2]:
Lemma B.3. For any submodular function $g$ and all $S, T \subseteq I$ it holds that

$$g(T \cup S) \le g(S) + \sum_{i \in T \setminus S} (g(S + i) - g(S)). \qquad (13)$$



¹ A preliminary version of the paper claimed a factor of (1 − 1/e) instead of 1/2; we thank C. Chekuri for pointing out the error.
Proof. $g(T \cup S) = g(S) + (g(T \cup S) - g(S)) = g(S) + g_S(T \setminus S) \le g(S) + \sum_{i \in T \setminus S} g_S(\{i\}) = g(S) + \sum_{i \in T \setminus S} (g(S + i) - g(S))$, where we used subadditivity of the submodular function $g_S$. □

Let $r + 1$ be the first step in the greedy algorithm at which either (a) the algorithm stops because $\theta_{r+1} \le 0$, or (b) we consider some element $i_{r+1} \in C$ and it is dropped by the greedy algorithm, i.e., we set $S_{r+1} = S_r$ and $I_{r+1} = I_r - i_{r+1}$. Note that before this point either we considered elements from $C$ and picked them, or the element considered was not in $C$. In fact, let us assume that there are no elements that are neither in $C$ nor picked by our algorithm: we can drop them and perform the same algorithm and analysis again without changing anything. Hence we can assume we have not dropped any elements before this, and $S_t = \{i_1, i_2, \ldots, i_t\}$ for all $t \in \{0, 1, \ldots, r\}$.



Now we apply Lemma B.3 to the submodular function $f_Y(\cdot)$ with sets $S = S_t$ and $T = C$ to get

$$f_Y(C \cup S_t) \le f_Y(S_t) + \sum_{i \in C \setminus S_t} (f_Y(S_t + i) - f_Y(S_t)) = f_Y(S_t) + \sum_{i \in C \setminus S_t} (f(S_t + i) - f(S_t)). \qquad (14)$$



Suppose case (a) happened and we stopped because $\theta_{r+1} \le 0$. This means that every term in the summation in (14) must be non-positive, and hence $f_Y(C \cup S_r) \le f_Y(S_r)$, or equivalently, $f(C \cup S_r) \le f(S_r)$. In this case, we are not even losing the factor of 1/2.

Case (b) is if the greedy algorithm drops the element $i_{r+1} \in C$. Since $i_{r+1}$ was dropped, it must be the case that $c(S_r) \le B$ but $c(S_r + i_{r+1}) = B' > B$. In this case the right-hand expression in (14) has some positive terms for each of the values of $t \le r$, and hence for each $t$, we get

$$f_Y(C \cup S_t) \le f_Y(S_t) + B \cdot \theta_{t+1}. \qquad (15)$$

To finish up, we prove a lemma similar to Lemma 2.1.

Lemma B.4. $f_Y(S_r + i_{r+1}) \ge \frac{1}{2} f_Y(S_r \cup C)$.

Proof. If not, then we have

$$f_Y(S_r \cup C) - f_Y(S_r + i_{r+1}) > f_Y(S_r + i_{r+1}).$$

Since we are in the case that $\theta_{r+1} > 0$, we know that $f_Y(S_r + i_{r+1}) > f_Y(S_r)$, and hence

$$f_{Y \cup S_r}(C) = f_Y(S_r \cup C) - f_Y(S_r) > f_Y(S_r + i_{r+1}).$$

Now, the subadditivity of $f_{Y \cup S_r}(\cdot)$ implies that there exists some element $e \in C$ with

$$\frac{f_{Y \cup S_r}(e)}{c_e} > \frac{f_Y(S_r + i_{r+1})}{B}.$$

Submodularity now implies that at each point in time $i \le r + 1$, the marginal increase per unit cost for element $e$ is

$$\frac{f_{Y \cup S_i}(e)}{c_e} > \frac{f_Y(S_r + i_{r+1})}{B}.$$

Now since the greedy algorithm picked elements with the largest marginal increase per unit cost, the marginal increase per unit cost at each step was strictly greater than $f_Y(S_r + i_{r+1})/B$. Hence, at the moment the total cost of the picked elements exceeded $B$, the total value accrued would be strictly greater than $f_Y(S_r + i_{r+1})$, which is a contradiction. □

Now for the final calculations:

$$\begin{aligned}
f(S_r) &\ge f(Y) + f_Y(S_r) \\
&\ge f(Y) + f_Y(S_r + i_{r+1}) - (f_Y(S_r + i_{r+1}) - f_Y(S_r)) \\
&\ge f(Y) + f_Y(S_r + i_{r+1}) - (f(S_r + i_{r+1}) - f(S_r)) \\
&\ge f(Y) + \tfrac{1}{2} f_Y(S_r \cup C) - f(Y)/3 && \text{(using Lemma B.4 and (12))} \\
&\ge \tfrac{1}{2} f(S_r \cup C). && \text{(by the definition of } f_Y(\cdot)\text{)}
\end{aligned}$$

Hence this set $S_r$ will be in the family of sets output, and will satisfy the claim of the theorem.
C Proofs from the Submodular Secretaries Section

In this section, we give the missing proofs from Section 4.

C.1 Proof for Cardinality Constrained Submodular Secretaries

Theorem 4.5. The algorithm for the cardinality-constrained submodular maximization problem in the secretary setting gives an O(1) approximation to OPT.

The proof basically shows that with reasonable probability, both the first and the second half of the stream have a 
reasonable fraction of OPT, so when we run the offline algorithm on the first half, using its output to extract value 
from the second half gives us a constant fraction of OPT. 

Proof. Let $C^* = \{e_1, \ldots, e_{k'}\}$ denote some set with $k' \le k$ elements such that $f(C^*) = \mathrm{OPT}$. Without loss of generality, we normalize so that OPT = 1. Suppose the elements of $C^*$ have been listed in the "greedy order" (i.e., in order of decreasing marginal utility), and let $a_j$ denote the marginal utility of $e_j$ when it is added to $\{e_1, e_2, \ldots, e_{j-1}\}$. We consider two cases: in the first case, $a_1 \ge 1/c$, where $c > 1$ is some constant to be determined. In this case, with probability $1/(2\mathrm{e})$, the algorithm runs Dynkin's secretary algorithm and obtains value at least $a_1$, achieving a $1/(2c\mathrm{e})$ approximation.

In the other case, $a_i < 1/c$ for all $i$. We imagine randomly partitioning the elements of the input set $X$ into two sets, $X_1$ and $X_2$, with each element belonging to $X_1$ independently with probability 1/2. This corresponds to the algorithm's division of $\sigma$ into the first (random) $m$ elements $\sigma_m$ and the remaining elements $\sigma - \sigma_m$. Let $C_1^*$ and $C_2^*$ denote the optimal solutions restricted to sets $X_1$ and $X_2$ respectively. Define the random variable $A = \sum_{i=1}^{k'} a_i Y_i$, where each $Y_i \in \{-1, 1\}$ is selected uniformly at random. Note that by submodularity, $f(C_1^*) + f(C_2^*) \ge f(C^*) = 1$. We wish to lower bound $\min(f(C_1^*), f(C_2^*))$, and to do this it is sufficient to upper bound the absolute value $|A|$. To see this, suppose that, for some setting of the $Y_i$'s it holds that $\sum_{i: Y_i = 1} a_i \ge \sum_{i: Y_i = -1} a_i$ (the other case is identical). Now if $|A| = \sum_{i: Y_i = 1} a_i - \sum_{i: Y_i = -1} a_i \le x$, we have:

$$\sum_{i: Y_i = -1} a_i \ge \Big(\sum_{i: Y_i = 1} a_i\Big) - x = 1 - \Big(\sum_{i: Y_i = -1} a_i\Big) - x$$

and hence

$$\sum_{i: Y_i = -1} a_i \ge \frac{1 - x}{2}.$$

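Hence, we would like to upper bound $|A|$ with high probability. Since each $Y_i$ is independent with expectation 0, we have $E[A] = 0$ and $E[A^2] = \sum_{i=1}^{k'} a_i^2$. The standard deviation of $A$ is:

$$\sigma = \sqrt{\sum_{i=1}^{k'} a_i^2} \le \sqrt{\frac{1}{c} \sum_{i=1}^{k'} a_i} = \frac{1}{\sqrt{c}}.$$

By Chebyshev's inequality, for any $d > 0$, we have

$$\Pr\Big[|A| \ge \frac{d}{\sqrt{c}}\Big] \le \frac{1}{d^2}.$$

That is, except with probability $1/d^2$, $\min(f(C_1^*), f(C_2^*)) \ge (1 - \frac{d}{\sqrt{c}})/2$. Now for some calculations. With probability 1/2, we do not run Dynkin's algorithm. Independently of this, with probability 1/2, $f(C_1^*) \le f(C_2^*)$, i.e., the value $\min(f(C_1^*), f(C_2^*))$ is achieved on $\sigma_m$. With probability $(1 - 1/d^2)$, this value is at least $(1 - \frac{d}{\sqrt{c}})/2$.

Now we run a $\rho_{\mathrm{off}}$-approximation on $\sigma_m$, and thus with probability $\frac{1}{4}(1 - 1/d^2)$,

$$f(A_1) \ge \frac{1}{2}\Big(1 - \frac{d}{\sqrt{c}}\Big) \cdot \frac{1}{\rho_{\mathrm{off}}}.$$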
If we use this as a lower bound for $f(C_2^*)$ (which is fine since we are in the case where $f(A_1) \le f(C_1^*) \le f(C_2^*)$), the semi-online algorithm gives us a value of at least $f(A_1)/\rho_{\mathrm{on}}$. Hence we have

$$E[f(A_2)] \ge \frac{1}{2}\Big(1 - \frac{d}{\sqrt{c}}\Big) \cdot \frac{1}{\rho_{\mathrm{off}}} \cdot \frac{1}{4}(1 - 1/d^2) \cdot \frac{1}{\rho_{\mathrm{on}}}. \qquad (16)$$

Combining both cases and optimizing over the parameters $d$ and $c$ ($d \approx 3.08$, $c \approx 260.24$) gives the claimed O(1) approximation. □



C.2 Proof for Partition Matroid Submodular Secretaries 

Let $S_0$ be the set of the first $N_0$ elements, and let $S_j$ denote the elements in epoch $j$. Since the input permutation itself is random, the distribution over the sets $S_0, \ldots, S_k$ is identical to one resulting from the following process: each element $e$ independently chooses a real number $r_e$ in (0, 1) and is placed in $S_0$ if $r_e < \frac{1}{2}$, and in $S_j$ if $\frac{1}{2} + \frac{j-1}{100k} \le r_e < \frac{1}{2} + \frac{j}{100k}$. We shall use this observation to simplify our analysis.

For the following lemma, we need to keep track of several events: 

1. $H_{i,j}$: The highest element from partition group i defined under the valuation function used during epoch j
falls into epoch j. 

2. Fij: The highest element from partition group i among those seen until the end of epoch j (defined under the 
valuation function used during epoch j) falls into epoch j. 

3. Lij: The highest element from partition group i defined under the valuation function used during epoch j 
does not fall before epoch j. 

4. Si,j: The second highest element (if any) from partition group i defined under the valuation function used 
during epoch j falls before epoch j. 

5. Pij: Some element from partition group i has already been selected before epoch j. 

In the definitions above, we assume that a fixed tie breaking rule is used to ensure that there is a unique highest and 
second highest element. 

Lemma 4.7. For all partition groups $i$ and epochs $j$, the algorithm selects the highest element from group $i$ (according to the valuation function during epoch $j$) during epoch $j$ with probability at least $\Omega(\frac{1}{k})$. Specifically:

$$\Pr\Big[(H_{i,j} \wedge S_{i,j} \wedge \neg P_{i,j}) \wedge \Big(\bigwedge_{i' \ne i} \neg F_{i',j}\Big)\Big] = \Omega\Big(\frac{1}{k}\Big)$$

where the probability is over the random permutation of the elements.

Proof. We observe that the event $(H_{i,j} \wedge S_{i,j} \wedge \neg P_{i,j}) \wedge (\bigwedge_{i' \ne i} \neg F_{i',j})$ implies that the algorithm selects the highest element from group $i$ in epoch $j$. We will lower bound the probability of this event. We will show this by considering the events $P_{i,j}$, $S_{i,j}$, $L_{i,j}$, $\bigwedge_{i' \ne i} \neg F_{i',j}$ and $H_{i,j}$ in that order, and lower bound the probability of each conditioning on the previous ones.

Under any (arbitrary) valuation function, the events $L_{i,j}$ and $S_{i,j}$ depend on the real numbers chosen by the highest and the second highest elements. Thus $\Pr[L_{i,j} \wedge S_{i,j}] \ge \frac{1}{2}(\frac{1}{2} - \frac{1}{100}) \ge \frac{1}{5}$.

Let $Q_{i,j}$ denote the number of elements from group $i$ that do not appear in $S_0, \ldots, S_{j-1}$, but are higher (under the valuation function at epoch $j$) than any group $i$ element in $S_0, \ldots, S_{j-1}$. It is easy to see that the random variable $Q_{i,j}$ is dominated by a geometric random variable with parameter $\frac{1}{2}$. Moreover, any element $e$ contributing to $Q_{i,j}$ appears in epoch $j$ with probability at most $\frac{1}{40k}$, so that $\Pr[F_{i,j}] \le \frac{E[Q_{i,j}]}{40k} \le \frac{1}{20k}$. Since $P_{i,j} \subseteq \bigcup_{j' < j} F_{i,j'}$, we conclude that $\Pr[P_{i,j}] \le \sum_{j' < j} \Pr[F_{i,j'}] \le \frac{1}{20}$. It follows that $\Pr[L_{i,j} \wedge S_{i,j} \wedge \neg P_{i,j}] \ge \Pr[L_{i,j} \wedge S_{i,j}] - \Pr[P_{i,j}] \ge \frac{1}{5} - \frac{1}{20} = \frac{3}{20}$.

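For convenience, let us define event $\mathcal{E}_{i,j} = (L_{i,j} \wedge S_{i,j} \wedge \neg P_{i,j})$. We have:

$$\Pr[\mathcal{E}_{i,j}] = \Pr[L_{i,j} \wedge S_{i,j} \mid \neg P_{i,j}] \Pr[\neg P_{i,j}] \ge \frac{3}{20}\Big(1 - \frac{1}{20}\Big) = \frac{57}{400}.$$

We next upper bound the probability that groups $i' \ne i$ have elements in epoch $j$ that the algorithm might select, conditioned on $\mathcal{E}_{i,j}$. With $Q_{i',j}$ defined as above, we have

$$\Pr[F_{i',j} \mid \mathcal{E}_{i,j}] \le E[Q_{i',j} \mid \mathcal{E}_{i,j}] \cdot \frac{1}{40k} \le \frac{E[Q_{i',j}]}{\Pr[\mathcal{E}_{i,j}]} \cdot \frac{1}{40k} \le \frac{20}{57k}.$$

Since there are at most $k$ groups $i'$, by a union bound: $\Pr[\bigvee_{i' \ne i} F_{i',j} \mid \mathcal{E}_{i,j}] \le \frac{20}{57}$, and so: $\Pr[\bigwedge_{i' \ne i} \neg F_{i',j} \mid \mathcal{E}_{i,j}] \ge \frac{37}{57}$. Consequently:

$$\Pr\Big[\mathcal{E}_{i,j} \wedge \Big(\bigwedge_{i' \ne i} \neg F_{i',j}\Big)\Big] \ge \Pr[\mathcal{E}_{i,j}] \cdot \Pr\Big[\bigwedge_{i' \ne i} \neg F_{i',j} \,\Big|\, \mathcal{E}_{i,j}\Big] \ge \frac{57}{400} \cdot \frac{37}{57} = \frac{37}{400}.$$

To complete the proof, we observe $\Pr[H_{i,j} \mid \mathcal{E}_{i,j} \wedge (\bigwedge_{i' \ne i} \neg F_{i',j})] \ge \frac{1}{40k}$, and so

$$\Pr\Big[H_{i,j} \wedge \mathcal{E}_{i,j} \wedge \Big(\bigwedge_{i' \ne i} \neg F_{i',j}\Big)\Big] \ge \frac{37}{400} \cdot \frac{1}{40k} = \frac{37}{16000k}.$$

□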



D Lower Bounds for the Constrained Submodular Maximization Problem in the 
Secretary Setting 

In this section we show lower bounds for the secretary problem over submodular functions. We first note that Kleinberg [Kle05] showed that for additive functions, the maximization problem in the online setting with a $k$-uniform matroid constraint can be approximated within a factor of $1 - O(1/\sqrt{k})$. We show that this is not the case for submodular functions, even in the information theoretic, semi-online setting (where the algorithm knows the value of OPT) by exhibiting a gap for arbitrarily large $k$.



Theorem D.1. No algorithm approximates submodular maximization in the semi-online setting with a $k$-uniform matroid constraint better than a factor of $\frac{8}{9}$ for $k = 2$ or $\frac{17}{18}$ for any even $k$.



No non-trivial bound is possible for k = 1 because the algorithm knows OPT. Thus the standard secretary lower 
bounds will not work. 

Let $R, S$ be two finite sets such that $S \subseteq R$. We define COVER($R$, $S$) as follows: define the universe to be $U = \{i_j : i \in R, j \in \{B, T\}\}$, define the set of elements $W$ to contain $i_B = \{iB\}$ for $i \in R$ and $i_{TB} = \{iB, iT\}$ for $i \in S$. Define a submodular function $f(C) = |\bigcup_{s \in C} s|$ where $C \subseteq W$.

We first prove the case for $k = 2$ with a small example and case analysis. Consider COVER({1, 2}, {r}), where $r \in \{1, 2\}$. The universe is $U = \{1B, 1T, 2B, 2T\}$. The three elements are $1_B = \{1B\}$, $2_B = \{2B\}$ and $r_{TB} = \{rB, rT\}$.

We will choose a uniformly random $r \in \{1, 2\}$ and in the semi-online setting will require the algorithm to pick at most $k = 2$ of the sets appearing in random order, while trying to maximize $f$. Let $\bar{r} = 3 - r$, then the offline OPT is $C^* = \{r_{TB}, \bar{r}_B\}$ with $f(C^*) = 3$.

Figure 4: Illustration of COVER({1, 2}, {2})



Claim D.2. No algorithm has expected payoff greater than $\frac{8}{3}$ on the instance COVER({1, 2}, {r}) in the semi-online setting when $r$ is drawn uniformly at random.

Because OPT = 3, Claim D.2 implies no algorithm can do better than a $\frac{8}{9}$ fraction of OPT, which gives us the first part of the theorem.

Proof. We proceed by case analysis. In the case where the first element that arrives is $r_{TB}$, the algorithm knows $r$
and can obtain OPT = 3. This happens with probability $\frac{1}{3}$.

In the case where the first element that arrives is $1_B$, the algorithm can accept or reject the element. If the
algorithm rejects, then it may as well take the next two elements that arrive. Since $r = 1$ with probability half, the
expected payoff is at most $\frac{5}{2}$.

Now suppose the algorithm accepts $1_B$ in the first position. The algorithm should now pick $r_{TB}$ (and reject
$2_B$ if it comes before $r_{TB}$) because the marginal value of $r_{TB}$ is at least as large as that of $2_B$. Since $r$ is random,
this marginal value is $\frac{3}{2}$ in expectation, and hence the expected payoff of the algorithm is once again $\frac{5}{2}$.

Similarly, if the first element is $2_B$, the payoff is bounded by $\frac{5}{2}$ in expectation. Thus the total expected payoff of
the algorithm is bounded by

$$\frac{1}{3} \cdot 3 + \frac{1}{3} \cdot \frac{5}{2} + \frac{1}{3} \cdot \frac{5}{2} = \frac{8}{3}.$$

□
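The case analysis above can be cross-checked by brute force. The following sketch enumerates all 12 equally likely scenarios (two values of r, six arrival orders) and computes the best expected payoff of any policy that sees only the sets that have arrived so far. The function names `elements` and `best_total` are hypothetical; this is an independent sanity check written under those assumptions, not part of the original proof.

```python
from fractions import Fraction
from itertools import permutations


def elements(r):
    """The three sets of COVER({1, 2}, {r})."""
    return (frozenset({"1B"}), frozenset({"2B"}), frozenset({f"{r}B", f"{r}T"}))


def best_total(t, chosen, scenarios, k=2):
    """Payoff of an optimal policy, summed over the given scenarios.

    All scenarios passed in share the same first t arrivals, so they are
    indistinguishable to the algorithm at time t.
    """
    if t == 3:
        return len(frozenset().union(*chosen)) * len(scenarios)
    # Group scenarios by the set arriving next: this is all the algorithm observes.
    groups = {}
    for order in scenarios:
        groups.setdefault(order[t], []).append(order)
    total = 0
    for arriving, group in groups.items():
        reject = best_total(t + 1, chosen, group, k)
        # Accepting is only allowed while fewer than k sets have been chosen.
        accept = best_total(t + 1, chosen + (arriving,), group, k) if len(chosen) < k else 0
        total += max(reject, accept)
    return total


scenarios = [order for r in (1, 2) for order in permutations(elements(r))]
value = Fraction(best_total(0, (), scenarios), len(scenarios))
print(value)  # 8/3
```

The decision at each step is made once per observed history rather than once per scenario, which is what prevents the policy from using the hidden value of r before the set r_TB has arrived.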



Now we would like to show that something similar is true for much larger $k$. The basic idea is to combine many
disjoint instances of COVER({1, 2}, {r}), and show that if the algorithm does well overall, it must have done well
on each instance, violating Claim D.2.



Claim D.3. For any even $k$, no algorithm has expected payoff more than $\frac{17}{12}k$ in the semi-online setting on instances
of COVER({1, . . . , k}, S) trying to maximize $f$ and restricted to picking $k$ sets, when $S$ is drawn uniformly at random
among subsets of {1, . . . , k} with $k/2$ elements.

Because OPT = $3k/2$, Claim D.3 implies no algorithm can do better than a $\frac{17}{18}$ fraction of OPT, which gives
us the second part of the theorem.

Proof. For the sake of analysis, we think of the instance of COVER({1, . . . , k}, S) being created by first choosing a
matching on the set {1, . . . , k} and then within each edge $e = (i, j)$ of the matching choosing $e_r \in \{i, j\}$ to include
in $S$.

We can then think of the sets of COVER({1, . . . , k}, S) being generated by taking the sets of the instance
COVER({i, j}, {$e_r$}) for each edge $e = (i, j)$ in the matching. Call each of these $k/2$ instances of COVER({i, j}, {$e_r$})
a puzzle.
Fix a semi-online algorithm $A$. Let $C$ be the set of elements chosen by $A$. For $0 \le i \le 3$, let $P_i$ be the set
of puzzles such that $C$ contains exactly $i$ elements from the puzzle; let $x_i$ be the expected size of $P_i$ (over the
randomness of the assignments of puzzles, the ordering, and $A$); and let $E_i$ be the expected payoff from all the
puzzles in $P_i$. Note that $E_0 = 0$ and $E_3 = 3x_3$.
Claim D.4. $E_1 + E_2 \le \frac{4k}{3} - 2x_3$.

Proof. Given an instance of COVER({1, 2}, {r}), construct a random instance of COVER({1, . . . , k}, S) by generating
a random matching and randomly picking a special edge $e = (i, j)$, where $i$ and $j$ are randomly ordered. For
each edge $e' = (i', j')$ pick $e'_r \in \{i', j'\}$ to include in $S$. Now run $A$ on this instance of COVER({1, . . . , k}, S),
except that whenever an element of the puzzle corresponding to edge $e$ comes along, replace it with the next element
from the given instance of COVER({1, 2}, {r}), with 1 renamed to $i$ and 2 renamed to $j$. Whenever $A$ chooses an
element from the instance of COVER({1, 2}, {r}), choose that element (it may be that $A$ selects more than 2 elements,
in which case, just select the first 2).

This instance of COVER({1, . . . , k}, S) has the same distribution as in the claim, and the given instance of
COVER({1, 2}, {r}) is a random puzzle in this distribution. Thus, the expected payoff of the COVER({1, 2}, {r})
instance is at least $(E_0 + E_1 + E_2 + 2x_3)/(k/2)$. By Claim D.2 this is $\le \frac{8}{3}$. Recalling $E_0 = 0$ and rearranging gives
us the claim. □
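For reference, the rearrangement at the end of this proof can be spelled out as follows. This is a sketch of the arithmetic only, using the bound $\frac{8}{3}$ from Claim D.2 and $E_0 = 0$:

```latex
\[
\frac{E_0 + E_1 + E_2 + 2x_3}{k/2} \le \frac{8}{3}
\;\Longrightarrow\;
E_1 + E_2 + 2x_3 \le \frac{8}{3}\cdot\frac{k}{2} = \frac{4k}{3}
\;\Longrightarrow\;
E_1 + E_2 \le \frac{4k}{3} - 2x_3 .
\]
```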
We combine the above claim with the fact that $E_0 = 0$ and $E_3 = 3x_3$ to get that

$$E[f(C)] = E_1 + E_2 + E_3 \le \frac{4k}{3} + x_3. \tag{18}$$

Note also that $A$ receives payoff at most 2 from any puzzle in $P_1$, and at most 0 from any puzzle in $P_0$. The
maximum payoff from each puzzle is 3, which occurs in OPT. Thus

$$E[f(C)] \le \frac{3k}{2} - 3x_0 - x_1. \tag{19}$$

Finally, there are $k/2$ puzzles, so $x_0 + x_1 + x_2 + x_3 = k/2$. Additionally, because the algorithm never picks
more than $k$ elements, we have $x_1 + 2x_2 + 3x_3 \le k$. Solving the first equation for $x_2$ and substituting for $x_2$ in the
second we get

$$0 \le 2x_0 + x_1 - x_3. \tag{20}$$
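The substitution leading to (20) is short enough to write out in full. This is a sketch of the algebra under the two constraints just stated:

```latex
\begin{align*}
x_2 &= \tfrac{k}{2} - x_0 - x_1 - x_3, \\
x_1 + 2\left(\tfrac{k}{2} - x_0 - x_1 - x_3\right) + 3x_3 &\le k, \\
k - 2x_0 - x_1 + x_3 &\le k, \\
0 &\le 2x_0 + x_1 - x_3 .
\end{align*}
```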



Adding Equations (18), (19), and (20) we see that $2E[f(C)] \le \frac{17}{6}k - x_0$, which implies that $E[f(C)] \le \frac{17}{12}k$. □
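Term by term, the sum of the three inequalities works out as below. This is a sketch that takes the constant $\frac{4k}{3}$ in (18) from Claim D.4 and the bound $\frac{8}{3}$ of Claim D.2:

```latex
\begin{align*}
2\,E[f(C)]
  &\le \left(\tfrac{4k}{3} + x_3\right)
     + \left(\tfrac{3k}{2} - 3x_0 - x_1\right)
     + \left(2x_0 + x_1 - x_3\right) \\
  &= \tfrac{17}{6}k - x_0
   \;\le\; \tfrac{17}{6}k ,
\end{align*}
so $E[f(C)] \le \tfrac{17}{12}k = \tfrac{17}{18}\cdot\tfrac{3k}{2} = \tfrac{17}{18}\,\mathrm{OPT}$.
```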